if(!require(liver)) install.packages("liver")
if(!require(ggplot2)) install.packages("ggplot2")
if(!require(pROC)) install.packages("pROC")
if(!require(dplyr)) install.packages("dplyr")
if(!require(readr)) install.packages("readr")
if(!require(tidyr)) install.packages("tidyr")
if(!require(kableExtra)) install.packages("kableExtra")
if(!require(showtext)) install.packages("showtext")
if(!require(grid)) install.packages("grid")
if(!require(naivebayes)) install.packages("naivebayes")
if(!require(psych)) install.packages("psych")
if(!require(ggcorrplot)) install.packages("ggcorrplot")
# Load libraries
library(liver)
library(ggplot2)
library(pROC)
library(dplyr)
library(readr)
library(tidyr)
library(kableExtra)
library(showtext)
library(grid)
library(naivebayes)
library(psych)
library(ggcorrplot)
In recent years, Environmental, Social, and Governance (ESG) scores have played a central role in helping investors assess corporate responsibility and sustainability. However, the hype around ESG scores and their purported significance in demonstrating a company’s commitment to sustainability is increasingly being called into question. According to a report by the Cambridge Institute for Sustainability Leadership (Hooper & Gilding, 2024), as cited in a [Financial Times article] (https://on.ft.com/3MQZX7u), several reasons have emerged prompting a reconsideration of current approaches to corporate sustainability. The report argues that while ESG has fostered awareness and action towards corporate responsibility, it has ultimately failed to deliver on the transformative changes needed to address global sustainability issues. The report highlights that, despite progress in areas like renewable energy, businesses are still contributing to environmental degradation at unsustainable rates. This is largely due to the tension between short-term profitability and long-term sustainability, a conflict that ESG initiatives alone could not resolve. Therefore, the authors of the report call for a transformative shift from traditional ESG frameworks to what they term “competitive sustainability”. This model would require businesses to not only set internal sustainability goals but also drive structural changes across entire markets. This approach recognizes that no business can thrive in an unsustainable market, and incremental efforts at corporate sustainability will remain inadequate without systemic shifts. However, it goes without saying that this shift will require stronger policy interventions, including government actions that align market incentives with sustainability goals. Unfortunately, this transformative shift towards “competitive sustainability” is exceedingly difficult to materialize in today’s global political landscape. Many governments are increasingly prioritizing short-term economic output over long-term sustainability commitments. The war in Ukraine and its ensuing geopolitical instability have led numerous countries to shift their focus towards defense initiatives and energy security, rather than accelerating their climate goals. Additionally, the economic recovery from the COVID-19 pandemic has prompted some nations to prioritize immediate economic growth and stability, often at the expense of sustainability initiatives. This political environment creates a serious challenge for implementing the systemic changes. As governments face pressure to maintain economic output, they may become reluctant to introduce or enforce stricter sustainability regulations, despite the clear need for them to align market incentives with environmental and social goals. However, the introduction of frameworks like the Corporate Sustainability Reporting Directive (CSRD), part of the European Green Deal, offers a glimmer of hope. The CSRD requires companies to provide detailed reports on how their operations impact the environment, as well as their plans to mitigate these effects (Corporate Sustainability Reporting, n.d.). It aims to foster greater transparency and accountability, pushing businesses to align more closely with sustainability goals while avoiding attempts for greenwashing. Even though, it is primarily a European initiative, the CSRD sets a precedent for the kind of government intervention needed to drive the “competitive sustainability” model advocated in the report. By mandating that companies disclose non-financial metrics such as their environmental footprint, the CSRD helps to create market pressure that could eventually lead to broader, systemic shifts. The growing awareness of the shortcomings of ESG metrics, particularly with regard to greenwashing, and the fragility of global systems, suggests that we are indeed moving beyond ESG scores as the primary tool for evaluating corporate responsibility. While ESG scores may still have limited value, they are increasingly seen as insufficient for measuring the resilience of businesses in today’s volatile political and economic landscape. Thus, given this evolving context, a key business problem arises in how companies should navigate these uncertain times when it comes to corporate sustainability and whether ESG scores still offer value in evaluating sustainability or economic performance, or reliance should shift entirely to more detailed reporting frameworks like the CSRD. Analyzing the relationship between ESG scores and stock performance in this context can help businesses understand whether ESG scores still provide meaningful insights into financial performance. By addressing this problem, businesses can develop strategies that meet both investor expectations and regulatory demands, ensuring that their sustainability reporting is both effective and transparent. As existing literature shows, research has been conducted on the potential financial benefits of ESG scores; however, the results remain mixed and inconclusive, leaving room for further investigation to fully understand their impact on stock performance. (Ademi & Klungseth, 2022, Darolles et al., 2023)
Research Question: To what extent do ESG scores predict financial performance and market stability for companies within the S&P 500, and how useful are these scores in driving sustainable growth and investor confidence, especially as they face increasing criticism for their reliability and effectiveness?
The raw data used in this project was found on Kaggle, a popular platform for data science projects, which provides a variety of datasets on diverse topics. The datasets utilized for this analysis was contributed by a Kaggle user and includes S&P 500 stock prices in a timeseries format and ESG (Environmental, Social, Governance) scores with other financial metrics. The ESG data, provided in the dataset, was collected to give an overview of corporate sustainability practices and their relationship with financial performance. The stock price data, spans from 2023 to 2024, offering an up-to-date perspective on stock movements for the largest public companies in the U.S.
While the dataset provides robust financial and sustainability data, in order to enrich the analysis, additional variables were derived from the stock price data. For instance, a measure of stock price volatility was calculated based on daily fluctuations, and log returns were used to assess average stock performance. The S&P 500 index was chosen because it represents the largest 500 publicly traded companies in the U.S., serving as a barometer of the overall economic health of the country.
See the python code for scraping the data from the web here: S&P 500 ESG and Stocks Data 2023-24
esg_data and price_data datasetsBefore delving into statistical analysis and building predictive models, the dataset components were thoroughly inspected, and the CSV files were imported into R for data preprocessing and exploration.
setwd("/Users/todomonkos/Documents/University/Year 4/Data Wrangling/Assignments/Final Project")
# Load the datasets
esg_data <- read.csv("sp500_esg_data.csv")
price_data <- read.csv("sp500_price_data.csv")
esg_data datasetesg_data dataset:
| Symbol | Full.Name | GICS.Sector | GICS.Sub.Industry | environmentScore | socialScore | governanceScore | totalEsg | highestControversy | percentile | ratingYear | ratingMonth | marketCap | beta | overallRisk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | Agilent Technologies | Health Care | Life Sciences Tools & Services | 1.12 | 6.42 | 6.10 | 13.64 | 2 | 7.98 | 2023 | 9 | 3.975183e+10 | 1.054 | 8 |
| AAL | American Airlines Group | Industrials | Passenger Airlines | 9.94 | 11.65 | 4.76 | 26.35 | 2 | 54.48 | 2023 | 9 | 7.335384e+09 | 1.433 | 10 |
| AAPL | Apple Inc. | Information Technology | Technology Hardware, Storage & Peripherals | 0.46 | 7.39 | 9.37 | 17.22 | 3 | 17.82 | 2023 | 9 | 3.296097e+12 | 1.240 | 1 |
| ABBV | AbbVie | Health Care | Biotechnology | 2.38 | 17.19 | 10.36 | 29.93 | 3 | 68.62 | 2023 | 9 | 3.416984e+11 | 0.619 | 6 |
| ABT | Abbott Laboratories | Health Care | Health Care Equipment | 2.27 | 14.24 | 8.33 | 24.83 | 3 | 48.27 | 2023 | 9 | 2.055344e+11 | 0.722 | 7 |
| ACGL | Arch Capital Group | Financials | Property & Casualty Insurance | 1.47 | 10.10 | 10.89 | 22.46 | 2 | 37.87 | 2023 | 9 | 4.285557e+10 | 0.596 | 5 |
Summary of the esg_data dataset:
## Symbol Full.Name GICS.Sector GICS.Sub.Industry
## Length:426 Length:426 Length:426 Length:426
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## environmentScore socialScore governanceScore totalEsg
## Min. : 0.000 Min. : 0.760 Min. : 2.960 Min. : 7.08
## 1st Qu.: 1.780 1st Qu.: 6.662 1st Qu.: 5.253 1st Qu.:16.41
## Median : 4.085 Median : 8.905 Median : 6.085 Median :21.07
## Mean : 5.784 Mean : 9.071 Mean : 6.701 Mean :21.56
## 3rd Qu.: 8.992 3rd Qu.:11.213 3rd Qu.: 7.638 3rd Qu.:26.02
## Max. :24.980 Max. :22.480 Max. :19.430 Max. :41.66
## highestControversy percentile ratingYear ratingMonth
## Min. :0.000 Min. : 1.27 Min. :2021 Min. :3.00
## 1st Qu.:1.000 1st Qu.:15.32 1st Qu.:2023 1st Qu.:9.00
## Median :2.000 Median :32.15 Median :2023 Median :9.00
## Mean :1.883 Mean :35.77 Mean :2023 Mean :8.96
## 3rd Qu.:2.000 3rd Qu.:53.23 3rd Qu.:2023 3rd Qu.:9.00
## Max. :5.000 Max. :93.23 Max. :2023 Max. :9.00
## marketCap beta overallRisk
## Min. :6.389e+09 Min. :-0.0570 Min. : 1.000
## 1st Qu.:2.054e+10 1st Qu.: 0.7492 1st Qu.: 3.000
## Median :4.029e+10 Median : 1.0295 Median : 5.000
## Mean :1.094e+11 Mean : 1.0359 Mean : 5.286
## 3rd Qu.:8.616e+10 3rd Qu.: 1.2805 3rd Qu.: 8.000
## Max. :3.296e+12 Max. : 3.2430 Max. :10.000
Structure of the esg_data dataset:
## 'data.frame': 426 obs. of 15 variables:
## $ Symbol : chr "A" "AAL" "AAPL" "ABBV" ...
## $ Full.Name : chr "Agilent Technologies" "American Airlines Group" "Apple Inc." "AbbVie" ...
## $ GICS.Sector : chr "Health Care" "Industrials" "Information Technology" "Health Care" ...
## $ GICS.Sub.Industry : chr "Life Sciences Tools & Services" "Passenger Airlines" "Technology Hardware, Storage & Peripherals" "Biotechnology" ...
## $ environmentScore : num 1.12 9.94 0.46 2.38 2.27 ...
## $ socialScore : num 6.42 11.65 7.39 17.19 14.24 ...
## $ governanceScore : num 6.1 4.76 9.37 10.36 8.33 ...
## $ totalEsg : num 13.6 26.4 17.2 29.9 24.8 ...
## $ highestControversy: num 2 2 3 3 3 2 2 2 1 3 ...
## $ percentile : num 7.98 54.48 17.82 68.62 48.27 ...
## $ ratingYear : num 2023 2023 2023 2023 2023 ...
## $ ratingMonth : num 9 9 9 9 9 9 9 9 9 9 ...
## $ marketCap : num 3.98e+10 7.34e+09 3.30e+12 3.42e+11 2.06e+11 ...
## $ beta : num 1.054 1.433 1.24 0.619 0.722 ...
## $ overallRisk : int 8 10 1 6 7 5 2 1 9 4 ...
After inspecting the esg_data dataset, we can see that
highestControversy and overallRisk are not
indicated as their true variable type. Therefore, we have to change
this:
esg_data$highestControversy <- factor(esg_data$highestControversy, levels = 0:10, ordered = TRUE)
esg_data$overallRisk <- factor(esg_data$overallRisk, levels = 1:10, ordered = TRUE)
str(esg_data)
## 'data.frame': 426 obs. of 15 variables:
## $ Symbol : chr "A" "AAL" "AAPL" "ABBV" ...
## $ Full.Name : chr "Agilent Technologies" "American Airlines Group" "Apple Inc." "AbbVie" ...
## $ GICS.Sector : chr "Health Care" "Industrials" "Information Technology" "Health Care" ...
## $ GICS.Sub.Industry : chr "Life Sciences Tools & Services" "Passenger Airlines" "Technology Hardware, Storage & Peripherals" "Biotechnology" ...
## $ environmentScore : num 1.12 9.94 0.46 2.38 2.27 ...
## $ socialScore : num 6.42 11.65 7.39 17.19 14.24 ...
## $ governanceScore : num 6.1 4.76 9.37 10.36 8.33 ...
## $ totalEsg : num 13.6 26.4 17.2 29.9 24.8 ...
## $ highestControversy: Ord.factor w/ 11 levels "0"<"1"<"2"<"3"<..: 3 3 4 4 4 3 3 3 2 4 ...
## $ percentile : num 7.98 54.48 17.82 68.62 48.27 ...
## $ ratingYear : num 2023 2023 2023 2023 2023 ...
## $ ratingMonth : num 9 9 9 9 9 9 9 9 9 9 ...
## $ marketCap : num 3.98e+10 7.34e+09 3.30e+12 3.42e+11 2.06e+11 ...
## $ beta : num 1.054 1.433 1.24 0.619 0.722 ...
## $ overallRisk : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 8 10 1 6 7 5 2 1 9 4 ...
price_data datasetHead of the price_data dataset:
my_kable(head(price_data))
| Date | A | AAL | AAPL | ABBV | ABT | ACGL | ACN | ADBE | ADI | ADM | ADP | ADSK | AEE | AEP | AES | AFL | AIG | AIZ | AJG | AKAM | ALB | ALL | ALLE | AMAT | AME | AMGN | AMP | AMT | AMZN | ANET | ANSS | AOS | APA | APD | APH | APTV | ARE | ATO | AVB | AVGO | AVY | AWK | AXP | AZO | BA | BAC | BALL | BAX | BBWI | BBY | BDX | BEN | BG | BIIB | BIO | BK | BKNG | BLK | BMY | BR | BRO | BSX | BWA | BX | BXP | C | CAG | CAH | CAT | CB | CBOE | CBRE | CCI | CCL | CDNS | CDW | CE | CF | CFG | CHD | CHRW | CHTR | CI | CINF | CL | CLX | CMCSA | CME | CMG | CMI | CMS | CNC | CNP | COF | COO | COP | COR | COST | CPB | CPRT | CPT | CRM | CSCO | CSGP | CSX | CTAS | CTRA | CTSH | CVS | CVX | D | DAL | DD | DE | DFS | DG | DGX | DHI | DHR | DIS | DLR | DLTR | DOV | DPZ | DRI | DTE | DUK | DVA | DVN | DXCM | EA | EBAY | ECL | ED | EFX | EG | EIX | EL | ELV | EMN | EMR | EOG | EQIX | EQR | EQT | ES | ESS | ETN | ETR | EW | EXC | EXPD | EXPE | EXR | F | FAST | FCX | FDS | FDX | FE | FFIV | FI | FIS | FITB | FMC | FRT | FTNT | FTV | GD | GE | GEN | GILD | GIS | GL | GLW | GM | GOOGL | GPC | GPN | GRMN | GS | GWW | HAL | HAS | HBAN | HCA | HD | HES | HIG | HLT | HOLX | HON | HPE | HPQ | HRL | HSIC | HST | HSY | HUBB | HUM | IBM | ICE | IDXX | IEX | IFF | INCY | INTC | INTU | IP | IPG | IQV | IRM | ISRG | IT | ITW | IVZ | J | JBHT | JCI | JKHY | JNJ | JNPR | JPM | K | KEY | KEYS | KHC | KIM | KLAC | KMB | KMI | KMX | KO | KR | L | LDOS | LEN | LH | LKQ | LLY | LMT | LNT | LOW | LRCX | LULU | LUV | LVS | LYB | MA | MAA | MAR | MAS | MCD | MCHP | MCK | MCO | MDLZ | MDT | MET | META | MGM | MHK | MKC | MLM | MMC | MMM | MNST | MO | MOS | MPC | MRK | MRO | MS | MSCI | MSFT | MSI | MTB | MTD | MU | NCLH | NDAQ | NDSN | NEE | NEM | NFLX | NI | NKE | NOC | NOW | NRG | NSC | NTAP | NTRS | NUE | NVDA | NVR | NWSA | NXPI | O | OKE | OMC | ON | ORCL | ORLY | OXY | PANW | PARA | PAYX | PCAR | PCG | PEG | PEP | PFE | PFG | PG | PGR | PH | PHM | PKG | PLD | PM | PNC | PNR | PNW | PPG | PPL | PRU | PSA | PSX | PTC | PWR | PYPL | QCOM | QRVO | RCL | REG | REGN | RF | RJF | RL | RMD | ROK | ROL | ROP | ROST | RSG | RVTY | SBAC | SBUX | SCHW | SHW | SJM | SLB | SNA | SNPS | SO | SPG | SPGI | SRE | STLD | STT | STX | STZ | SWK | SWKS | SYF | SYK | SYY | T | TAP | TDG | TEL | TER | TFX | TGT | TJX | TMO | TMUS | TPR | TRGP | TRMB | TROW | TRV | TSCO | TSLA | TSN | TT | TXN | TXT | UAL | UDR | UHS | ULTA | UNH | UNP | UPS | URI | USB | V | VLO | VMC | VRSK | VRSN | VRTX | VTR | VZ | WAB | WAT | WBA | WDC | WEC | WELL | WFC | WM | WMB | WMT | WRB | WTW | WY | WYNN | XEL | XOM | XYL | YUM | ZBH | ZTS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2023-01-03 00:00:00+00:00 | 150.04 | 12.74 | 125.07 | 162.38 | 109.58 | 62.46 | 270.26 | 336.92 | 162.45 | 89.65 | 237.66 | 185.15 | 88.49 | 94.87 | 28.11 | 71.58 | 62.93 | 127.32 | 187.26 | 84.97 | 214.50 | 136.66 | 107.70 | 96.73 | 140.52 | 261.65 | 310.66 | 214.67 | 85.82 | 120.91 | 238.64 | 58.92 | 43.62 | 306.57 | 38.200 | 92.95 | 144.11 | 111.20 | 161.50 | 55.348 | 183.21 | 154.10 | 147.12 | 2431.06 | 195.39 | 33.51 | 51.97 | 51.04 | 43.19 | 80.41 | 256.18 | 27.12 | 95.57 | 272.63 | 422.43 | 46.30 | 2032.21 | 712.04 | 72.26 | 134.56 | 56.65 | 46.01 | 35.44894 | 76.13 | 66.62 | 45.78 | 38.83 | 76.73 | 238.88 | 220.77 | 126.33 | 78.44 | 138.31 | 7.97 | 159.64 | 179.51 | 102.27 | 81.95 | 39.53 | 82.02 | 90.28 | 341.58 | 321.48 | 104.01 | 79.13 | 142.26 | 35.57 | 168.95 | 27.4226 | 241.04 | 63.73 | 79.74 | 29.56 | 92.97 | 83.1700 | 113.12 | 164.29 | 453.28 | 56.20 | 30.730 | 110.68 | 134.78 | 47.94 | 77.65 | 30.88 | 112.2475 | 23.21 | 57.59 | 92.91 | 173.99 | 62.97 | 32.61 | 68.86 | 424.29 | 96.99 | 246.72 | 155.77 | 90.68 | 232.5089 | 88.97 | 102.83 | 140.26 | 135.07 | 339.27 | 139.96 | 117.86 | 103.70 | 74.84 | 58.12 | 114.73 | 122.80 | 42.15 | 147.21 | 95.76 | 198.31 | 333.38 | 64.28 | 253.91 | 502.20 | 82.13 | 96.11 | 124.45 | 662.51 | 58.84 | 31.82 | 84.22 | 211.31 | 157.75 | 109.13 | 74.27 | 43.16 | 104.98 | 87.68 | 144.29 | 11.68 | 47.40 | 37.92 | 405.96 | 177.27 | 42.03 | 144.85 | 101.21 | 68.54 | 32.64 | 124.92 | 102.66 | 48.52 | 64.27 | 248.60 | 52.94397 | 21.62 | 85.39 | 83.22 | 119.59 | 32.54 | 33.82 | 89.12 | 169.81 | 100.20 | 93.57 | 346.22 | 555.93 | 37.66 | 61.95 | 14.10 | 243.21 | 315.91 | 134.52 | 75.35 | 125.03 | 75.55 | 214.30 | 16.06 | 26.75 | 45.75 | 79.60 | 15.93 | 226.93 | 234.49 | 500.49 | 141.55 | 103.23 | 406.01 | 229.49 | 105.64 | 79.55 | 26.73 | 391.18 | 35.41 | 33.79 | 203.86 | 49.67 | 265.63 | 337.49 | 220.32 | 18.22 | 121.29 | 172.60 | 64.74 | 176.10 | 178.19 | 32.53 | 135.12 | 66.92019 | 17.61 | 170.94 | 40.78 | 21.24 | 376.55 | 137.11 | 18.01 | 60.64 | 62.95 | 44.47 | 58.59 | 104.71 | 91.98 | 204.2784 | 53.86 | 364.99 | 477.53 | 55.37 | 199.04 | 414.30 | 323.38 | 32.60 | 49.33 | 83.99 | 346.80 | 155.42 | 147.75 | 47.42 | 264.33 | 69.10 | 374.33 | 276.47 | 66.25 | 78.26 | 72.15 | 124.74 | 33.28 | 106.86 | 83.47 | 338.19 | 165.98 | 102.3997 | 50.660 | 45.52 | 42.63 | 110.92 | 111.14 | 25.64 | 85.72 | 461.16 | 239.58 | 257.40 | 144.97 | 1461.87 | 50.37 | 11.86 | 61.76 | 238.03 | 83.83 | 49.58 | 294.95 | 27.44 | 118.75 | 540.33 | 385.50 | 31.93 | 247.70 | 60.79 | 89.78 | 131.06 | 14.315 | 4589.32 | 18.30 | 155.05 | 63.80 | 63.94 | 82.53 | 61.61 | 83.72 | 840.72 | 61.05 | 138.45 | 17.07 | 115.81 | 65.62000 | 15.68 | 62.05 | 179.41 | 51.26 | 83.77 | 151.57 | 130.14 | 291.76 | 46.20 | 129.06 | 112.69 | 101.30 | 159.52 | 45.20 | 74.63 | 126.84 | 29.20 | 99.48 | 274.18 | 101.03 | 119.65 | 140.61 | 74.58 | 107.20 | 89.14 | 48.71 | 62.83 | 720.47 | 21.47 | 106.11 | 109.11 | 209.11 | 261.42 | 36.55 | 434.46 | 115.83 | 128.68 | 141.14 | 281.74 | 100.83 | 81.94 | 239.37 | 158.99 | 51.50 | 229.38 | 319.68 | 71.90 | 117.54 | 335.17 | 76.855 | 95.17 | 78.40 | 51.88 | 227.60 | 76.19 | 90.50 | 32.39 | 246.03 | 75.99 | 18.74 | 49.44 | 625.77 | 114.66 | 85.93 | 249.85 | 151.73 | 79.01 | 553.18 | 139.04 | 38.83 | 70.65 | 50.60 | 109.15 | 187.12 | 223.18 | 108.10 | 63.60 | 171.53 | 163.21 | 70.26 | 37.21 | 38.59 | 142.61 | 471.63 | 518.64 | 207.58 | 175.28 | 356.71 | 44.64 | 207.39 | 120.04 | 176.10 | 177.34 | 201.92 | 286.02 | 45.37 | 40.12 | 99.76 | 345.59 | 37.19 | 31.41 | 94.16 | 66.95 | 41.79 | 157.03 | 32.15 | 47.86667 | 48.38667 | 245.18 | 30.86 | 85.61 | 70.07 | 106.51 | 110.76 | 126.96 | 127.28 | 146.85 |
| 2023-01-04 00:00:00+00:00 | 151.67 | 13.59 | 126.36 | 163.69 | 111.21 | 62.77 | 269.34 | 341.41 | 165.91 | 86.39 | 238.78 | 187.96 | 88.96 | 95.52 | 27.28 | 71.92 | 63.86 | 129.09 | 189.74 | 85.86 | 223.57 | 138.96 | 109.24 | 99.31 | 140.85 | 264.39 | 312.88 | 219.00 | 85.14 | 113.84 | 242.94 | 60.66 | 42.15 | 306.59 | 38.765 | 95.88 | 146.89 | 111.94 | 164.32 | 56.024 | 186.79 | 157.17 | 150.54 | 2445.46 | 203.64 | 34.14 | 52.49 | 52.33 | 47.73 | 81.94 | 258.09 | 27.83 | 95.00 | 270.81 | 433.70 | 47.44 | 2110.44 | 721.90 | 72.62 | 136.67 | 57.60 | 46.50 | 36.62852 | 78.01 | 66.71 | 46.96 | 38.65 | 77.00 | 241.36 | 224.40 | 124.47 | 79.72 | 143.35 | 8.74 | 159.23 | 181.15 | 109.15 | 82.41 | 40.65 | 82.79 | 90.80 | 354.00 | 310.30 | 106.35 | 79.64 | 143.72 | 36.59 | 170.08 | 27.5210 | 240.46 | 64.63 | 79.46 | 29.65 | 95.80 | 85.3925 | 113.34 | 162.87 | 456.56 | 55.76 | 30.990 | 112.75 | 139.59 | 47.55 | 78.00 | 31.28 | 110.9575 | 23.98 | 58.06 | 91.98 | 172.14 | 63.21 | 34.39 | 70.53 | 423.48 | 101.27 | 243.50 | 154.58 | 91.64 | 236.1436 | 91.98 | 103.65 | 140.52 | 136.17 | 344.39 | 143.03 | 119.56 | 104.81 | 77.15 | 58.90 | 114.33 | 125.01 | 43.09 | 149.76 | 96.59 | 204.61 | 341.09 | 66.39 | 261.91 | 485.44 | 86.02 | 95.42 | 124.36 | 680.15 | 59.85 | 33.21 | 85.33 | 215.25 | 158.82 | 109.21 | 76.47 | 43.73 | 106.18 | 90.99 | 144.47 | 12.01 | 47.97 | 39.00 | 407.79 | 181.45 | 42.34 | 145.28 | 102.06 | 69.27 | 33.46 | 122.94 | 105.06 | 48.30 | 65.22 | 245.22 | 56.02554 | 21.87 | 85.50 | 83.89 | 120.01 | 34.11 | 34.69 | 88.08 | 171.94 | 103.48 | 95.54 | 347.70 | 554.03 | 38.43 | 62.11 | 14.36 | 248.88 | 319.73 | 136.57 | 76.82 | 127.84 | 78.23 | 210.04 | 16.42 | 27.22 | 45.89 | 82.39 | 16.17 | 223.27 | 229.30 | 492.12 | 142.60 | 103.93 | 422.43 | 229.27 | 108.16 | 79.08 | 27.68 | 391.57 | 36.83 | 34.70 | 207.78 | 50.88 | 270.00 | 334.15 | 220.82 | 18.98 | 126.01 | 174.98 | 65.80 | 177.19 | 180.13 | 31.83 | 136.38 | 66.41315 | 18.08 | 170.49 | 41.24 | 21.65 | 380.23 | 136.81 | 18.23 | 65.12 | 62.92 | 44.97 | 59.27 | 104.37 | 93.61 | 205.9278 | 54.59 | 363.10 | 476.50 | 55.79 | 201.39 | 422.46 | 320.98 | 33.62 | 51.89 | 85.90 | 355.15 | 157.74 | 149.86 | 48.73 | 264.39 | 70.79 | 374.01 | 284.11 | 66.73 | 80.97 | 73.19 | 127.37 | 35.54 | 113.30 | 84.96 | 347.74 | 169.21 | 104.6405 | 51.075 | 45.41 | 43.27 | 110.58 | 112.08 | 25.32 | 86.65 | 474.12 | 229.10 | 258.02 | 146.59 | 1491.57 | 54.20 | 12.46 | 61.65 | 237.75 | 84.49 | 51.51 | 309.41 | 27.78 | 121.21 | 526.45 | 393.85 | 32.09 | 250.90 | 61.93 | 91.68 | 137.00 | 14.749 | 4674.22 | 18.98 | 157.96 | 64.61 | 64.44 | 84.44 | 62.20 | 84.48 | 826.74 | 61.11 | 138.54 | 18.50 | 117.07 | 65.47334 | 15.88 | 62.51 | 178.97 | 50.13 | 85.01 | 152.23 | 131.24 | 296.89 | 47.41 | 130.80 | 116.96 | 101.23 | 163.73 | 46.70 | 75.39 | 129.97 | 29.93 | 101.18 | 278.11 | 100.68 | 121.97 | 138.55 | 77.69 | 111.53 | 91.00 | 52.64 | 63.69 | 726.49 | 21.91 | 105.54 | 113.16 | 210.90 | 260.53 | 36.94 | 442.74 | 117.97 | 128.25 | 142.31 | 292.63 | 104.46 | 83.35 | 244.18 | 158.19 | 51.70 | 232.32 | 320.93 | 72.51 | 121.16 | 341.62 | 77.520 | 98.71 | 79.48 | 52.49 | 231.16 | 78.60 | 93.48 | 33.10 | 252.49 | 76.68 | 19.14 | 51.36 | 633.56 | 117.62 | 87.42 | 257.50 | 152.72 | 79.86 | 561.92 | 139.96 | 39.65 | 71.03 | 50.80 | 111.36 | 187.23 | 221.49 | 113.64 | 64.51 | 174.31 | 169.17 | 71.06 | 39.72 | 39.25 | 145.77 | 483.52 | 504.50 | 209.24 | 177.11 | 366.01 | 46.03 | 212.61 | 119.62 | 179.57 | 180.25 | 205.56 | 286.83 | 47.03 | 41.13 | 101.23 | 349.77 | 37.49 | 33.05 | 94.78 | 68.84 | 42.65 | 156.91 | 32.35 | 47.92000 | 48.22667 | 246.49 | 31.23 | 90.90 | 70.67 | 106.82 | 110.59 | 128.68 | 128.46 | 148.96 |
| 2023-01-05 00:00:00+00:00 | 152.11 | 13.99 | 125.02 | 163.49 | 110.80 | 62.94 | 262.98 | 328.44 | 159.69 | 84.23 | 233.63 | 182.07 | 86.00 | 93.78 | 25.87 | 71.50 | 63.51 | 126.16 | 186.44 | 84.62 | 218.61 | 139.34 | 107.75 | 97.92 | 137.98 | 266.86 | 308.75 | 212.13 | 83.12 | 109.49 | 231.65 | 59.38 | 41.76 | 301.82 | 38.350 | 95.86 | 143.05 | 108.87 | 157.98 | 55.502 | 183.12 | 154.67 | 146.43 | 2416.57 | 204.99 | 34.07 | 52.84 | 52.57 | 46.79 | 81.51 | 255.31 | 27.33 | 93.18 | 271.59 | 430.37 | 47.21 | 2144.94 | 701.24 | 71.92 | 133.02 | 57.08 | 45.76 | 36.97183 | 76.59 | 64.18 | 46.75 | 39.97 | 76.62 | 240.28 | 223.56 | 123.77 | 78.43 | 140.75 | 8.95 | 156.80 | 178.06 | 110.86 | 83.78 | 40.47 | 81.98 | 90.01 | 361.43 | 304.19 | 106.67 | 78.54 | 142.08 | 36.98 | 169.75 | 27.2832 | 236.72 | 63.28 | 77.26 | 28.98 | 93.98 | 84.8400 | 116.77 | 163.08 | 450.19 | 56.14 | 30.275 | 109.67 | 136.34 | 46.88 | 74.90 | 30.91 | 108.3825 | 23.67 | 57.57 | 90.45 | 175.24 | 61.46 | 35.23 | 70.87 | 414.73 | 99.55 | 245.42 | 154.50 | 92.31 | 226.2677 | 91.92 | 98.08 | 142.56 | 134.42 | 338.83 | 144.30 | 116.58 | 103.26 | 78.50 | 60.17 | 111.69 | 124.68 | 43.10 | 144.83 | 95.29 | 198.10 | 342.14 | 64.89 | 261.17 | 473.00 | 85.69 | 95.01 | 124.38 | 660.62 | 57.47 | 32.08 | 83.75 | 205.88 | 157.20 | 103.66 | 75.18 | 42.78 | 104.92 | 93.00 | 141.04 | 12.25 | 46.30 | 39.84 | 393.55 | 181.28 | 41.68 | 140.05 | 99.40 | 68.19 | 33.13 | 121.70 | 102.72 | 47.45 | 63.93 | 244.62 | 56.89545 | 21.10 | 85.85 | 84.41 | 120.19 | 34.39 | 35.00 | 86.20 | 168.57 | 101.98 | 95.32 | 343.76 | 537.53 | 38.95 | 63.30 | 14.26 | 250.30 | 315.47 | 140.78 | 77.08 | 126.84 | 77.63 | 204.47 | 16.45 | 27.21 | 46.05 | 80.03 | 15.79 | 223.08 | 224.43 | 488.02 | 141.11 | 102.58 | 419.20 | 225.74 | 107.29 | 80.16 | 27.56 | 375.62 | 36.20 | 35.23 | 207.05 | 48.95 | 268.62 | 324.77 | 218.88 | 18.68 | 123.61 | 169.20 | 65.03 | 173.78 | 178.80 | 31.17 | 135.35 | 66.34742 | 17.84 | 168.87 | 41.33 | 20.97 | 373.47 | 135.54 | 18.21 | 63.34 | 62.20 | 45.16 | 59.21 | 104.01 | 93.78 | 206.1598 | 53.94 | 358.92 | 477.07 | 54.13 | 200.77 | 417.06 | 325.93 | 33.53 | 51.53 | 86.93 | 351.77 | 153.01 | 148.88 | 48.14 | 262.16 | 69.10 | 374.05 | 276.69 | 66.19 | 80.03 | 72.60 | 126.94 | 35.15 | 113.04 | 84.45 | 337.97 | 166.14 | 102.8094 | 50.190 | 45.37 | 44.69 | 115.73 | 113.64 | 25.87 | 85.92 | 459.12 | 222.31 | 255.12 | 146.24 | 1489.51 | 54.71 | 12.76 | 60.24 | 229.74 | 82.63 | 51.21 | 309.70 | 27.13 | 120.62 | 528.52 | 366.32 | 31.75 | 247.48 | 61.33 | 91.51 | 137.05 | 14.265 | 4639.53 | 18.68 | 153.51 | 63.39 | 64.83 | 85.12 | 59.74 | 84.31 | 834.29 | 62.22 | 135.11 | 18.66 | 114.36 | 65.18667 | 15.66 | 61.12 | 177.10 | 49.66 | 84.39 | 150.34 | 131.53 | 298.10 | 47.67 | 129.88 | 112.85 | 100.82 | 159.37 | 46.16 | 73.65 | 123.11 | 29.24 | 99.44 | 271.27 | 103.02 | 120.05 | 135.57 | 76.27 | 109.40 | 90.29 | 53.38 | 62.29 | 723.76 | 21.67 | 107.25 | 114.46 | 205.07 | 258.58 | 36.31 | 435.92 | 118.40 | 125.52 | 138.81 | 281.37 | 104.43 | 83.11 | 227.75 | 159.39 | 52.67 | 230.05 | 315.04 | 70.38 | 117.42 | 336.08 | 76.240 | 98.79 | 78.71 | 53.49 | 208.68 | 78.64 | 92.42 | 31.98 | 249.82 | 76.27 | 19.21 | 49.58 | 636.46 | 116.26 | 87.12 | 253.00 | 154.26 | 80.65 | 556.94 | 144.48 | 39.72 | 70.93 | 50.12 | 108.21 | 187.55 | 219.63 | 110.34 | 64.36 | 171.95 | 166.93 | 70.07 | 41.24 | 37.63 | 148.97 | 488.94 | 489.96 | 203.08 | 173.84 | 364.09 | 45.67 | 211.11 | 126.59 | 175.46 | 177.64 | 200.04 | 287.89 | 46.76 | 41.70 | 99.88 | 347.03 | 35.19 | 35.23 | 92.41 | 67.39 | 42.42 | 153.88 | 32.02 | 47.75667 | 47.92000 | 246.44 | 30.30 | 91.34 | 69.24 | 109.21 | 107.45 | 127.70 | 126.08 | 145.45 |
| 2023-01-06 00:00:00+00:00 | 147.67 | 14.18 | 129.62 | 166.55 | 112.33 | 63.90 | 269.21 | 332.75 | 165.52 | 85.95 | 240.16 | 187.19 | 88.32 | 96.53 | 26.43 | 73.84 | 64.55 | 128.59 | 192.02 | 85.85 | 221.64 | 141.37 | 111.42 | 104.27 | 143.84 | 275.20 | 321.28 | 218.50 | 86.08 | 112.45 | 239.41 | 61.44 | 42.89 | 310.18 | 39.285 | 98.31 | 147.60 | 112.53 | 163.63 | 58.843 | 188.31 | 158.26 | 150.17 | 2486.64 | 213.00 | 34.41 | 54.26 | 48.45 | 46.12 | 83.04 | 259.34 | 28.38 | 96.62 | 279.25 | 418.68 | 47.95 | 2179.25 | 738.00 | 73.00 | 137.10 | 58.83 | 46.33 | 37.98415 | 79.22 | 67.05 | 47.31 | 40.96 | 78.64 | 248.86 | 228.86 | 123.49 | 80.92 | 145.71 | 9.20 | 159.66 | 183.22 | 116.62 | 84.41 | 41.64 | 83.59 | 93.31 | 367.73 | 302.68 | 109.74 | 79.98 | 144.44 | 37.89 | 175.09 | 28.0412 | 243.90 | 64.44 | 78.66 | 29.99 | 97.07 | 86.2925 | 118.23 | 165.71 | 482.87 | 57.01 | 31.195 | 114.00 | 140.51 | 48.32 | 77.12 | 32.20 | 110.4850 | 24.26 | 59.75 | 91.60 | 176.56 | 61.87 | 36.03 | 72.47 | 426.47 | 103.04 | 248.56 | 155.64 | 93.58 | 223.8387 | 93.92 | 101.38 | 145.83 | 139.42 | 342.62 | 146.98 | 119.44 | 105.18 | 81.05 | 61.23 | 115.82 | 123.89 | 45.11 | 149.46 | 97.83 | 204.62 | 351.40 | 66.87 | 263.81 | 475.48 | 88.71 | 97.95 | 127.47 | 673.94 | 59.36 | 33.34 | 85.80 | 214.70 | 161.10 | 107.68 | 77.01 | 43.67 | 108.18 | 94.11 | 144.00 | 12.58 | 47.61 | 42.28 | 404.46 | 185.77 | 42.37 | 141.64 | 102.18 | 69.30 | 34.47 | 125.46 | 104.24 | 48.69 | 65.54 | 248.25 | 57.41421 | 21.50 | 88.08 | 85.96 | 123.37 | 35.38 | 35.91 | 87.34 | 170.84 | 105.17 | 98.13 | 348.08 | 552.73 | 40.21 | 64.86 | 14.60 | 254.52 | 317.53 | 143.70 | 78.90 | 130.25 | 77.83 | 210.27 | 17.11 | 28.36 | 46.91 | 81.61 | 16.17 | 225.65 | 228.87 | 492.54 | 143.70 | 104.90 | 447.77 | 232.14 | 110.01 | 80.87 | 28.73 | 386.52 | 36.64 | 35.70 | 205.36 | 50.69 | 270.39 | 328.62 | 226.95 | 19.17 | 128.51 | 176.04 | 67.15 | 178.30 | 180.25 | 31.95 | 137.94 | 68.02817 | 18.37 | 175.42 | 42.59 | 21.53 | 397.59 | 138.45 | 18.58 | 65.78 | 63.40 | 45.74 | 60.47 | 104.95 | 95.66 | 208.7801 | 55.33 | 362.94 | 473.24 | 55.35 | 200.97 | 445.27 | 329.26 | 35.08 | 51.57 | 90.10 | 367.67 | 158.31 | 153.38 | 49.66 | 269.47 | 72.77 | 386.05 | 289.70 | 68.02 | 80.86 | 73.58 | 130.02 | 37.20 | 114.23 | 86.30 | 351.17 | 170.96 | 105.9532 | 51.215 | 46.45 | 46.73 | 116.82 | 114.84 | 26.60 | 87.56 | 469.01 | 224.93 | 263.84 | 150.19 | 1449.57 | 56.77 | 13.04 | 61.65 | 235.77 | 83.65 | 52.69 | 315.55 | 27.61 | 124.53 | 521.42 | 366.53 | 32.20 | 255.79 | 63.65 | 94.67 | 143.93 | 14.859 | 4796.82 | 19.00 | 159.63 | 64.08 | 66.57 | 86.08 | 62.47 | 85.66 | 842.52 | 63.75 | 134.85 | 19.03 | 117.45 | 66.88000 | 15.89 | 62.64 | 181.10 | 50.92 | 87.46 | 153.92 | 134.61 | 308.55 | 48.31 | 132.19 | 116.65 | 103.49 | 164.21 | 46.80 | 75.46 | 128.50 | 29.96 | 101.01 | 276.76 | 105.70 | 122.69 | 139.12 | 76.48 | 115.34 | 93.35 | 54.99 | 63.59 | 737.15 | 22.39 | 110.18 | 115.97 | 208.89 | 267.52 | 36.53 | 443.67 | 121.57 | 128.36 | 133.53 | 291.45 | 106.69 | 84.54 | 235.80 | 162.59 | 54.50 | 237.41 | 320.54 | 71.63 | 118.87 | 347.84 | 77.270 | 102.69 | 80.68 | 55.91 | 214.23 | 81.07 | 95.44 | 32.80 | 256.25 | 78.71 | 19.53 | 50.88 | 649.03 | 120.40 | 91.24 | 257.93 | 160.15 | 82.72 | 535.00 | 148.57 | 41.05 | 73.10 | 51.50 | 112.29 | 193.87 | 221.64 | 113.06 | 66.07 | 178.50 | 175.16 | 71.85 | 42.24 | 38.70 | 151.96 | 486.19 | 490.00 | 212.01 | 178.95 | 377.34 | 46.31 | 217.75 | 127.56 | 181.36 | 182.88 | 205.31 | 290.21 | 47.82 | 42.19 | 101.54 | 322.21 | 36.61 | 37.17 | 94.77 | 69.78 | 42.80 | 159.49 | 32.56 | 48.92667 | 49.52667 | 253.43 | 31.61 | 93.57 | 71.29 | 110.53 | 111.27 | 130.29 | 125.70 | 147.64 |
| 2023-01-09 00:00:00+00:00 | 147.47 | 14.61 | 130.15 | 161.66 | 112.15 | 62.67 | 273.75 | 341.98 | 167.10 | 85.37 | 239.51 | 192.47 | 89.41 | 97.95 | 27.25 | 71.45 | 63.87 | 125.11 | 191.76 | 86.88 | 224.00 | 140.01 | 112.27 | 106.49 | 142.50 | 270.12 | 321.22 | 219.30 | 87.36 | 114.87 | 247.99 | 61.49 | 43.37 | 311.67 | 39.560 | 97.98 | 146.59 | 113.79 | 162.47 | 57.689 | 189.48 | 158.44 | 150.40 | 2441.75 | 208.57 | 33.89 | 55.57 | 44.70 | 45.00 | 82.06 | 252.45 | 28.51 | 98.04 | 274.72 | 425.99 | 48.14 | 2208.41 | 752.99 | 71.01 | 138.66 | 58.84 | 45.59 | 37.18310 | 80.58 | 65.72 | 47.54 | 40.42 | 77.57 | 246.62 | 223.51 | 122.18 | 81.76 | 145.38 | 9.47 | 165.92 | 183.96 | 116.80 | 86.00 | 41.10 | 82.25 | 93.36 | 367.18 | 303.04 | 108.69 | 78.73 | 142.66 | 37.55 | 175.80 | 28.9348 | 244.29 | 64.99 | 77.59 | 30.47 | 97.45 | 86.0625 | 117.36 | 164.97 | 478.75 | 55.95 | 31.220 | 113.01 | 147.10 | 48.58 | 77.08 | 32.14 | 110.3275 | 24.79 | 60.86 | 91.49 | 175.18 | 61.59 | 36.77 | 73.48 | 428.20 | 103.88 | 241.05 | 152.93 | 92.77 | 226.1082 | 94.77 | 101.14 | 143.97 | 138.91 | 333.88 | 148.10 | 120.46 | 105.19 | 78.71 | 62.20 | 110.06 | 124.17 | 43.52 | 150.22 | 97.72 | 204.64 | 352.69 | 67.79 | 261.63 | 470.71 | 88.56 | 97.26 | 127.75 | 681.16 | 58.60 | 34.67 | 85.71 | 212.57 | 160.88 | 108.46 | 76.35 | 44.15 | 109.85 | 93.77 | 147.30 | 12.69 | 47.95 | 42.83 | 404.62 | 189.74 | 42.40 | 143.40 | 102.07 | 70.00 | 34.22 | 128.97 | 102.90 | 49.41 | 65.81 | 245.21 | 57.99681 | 21.49 | 85.10 | 84.68 | 120.28 | 35.42 | 35.92 | 88.02 | 170.41 | 106.46 | 98.23 | 353.00 | 557.63 | 40.44 | 63.70 | 14.41 | 249.79 | 317.81 | 144.22 | 76.85 | 129.08 | 79.79 | 210.67 | 16.96 | 28.64 | 46.09 | 80.53 | 16.20 | 225.39 | 231.38 | 481.71 | 143.55 | 105.24 | 448.08 | 233.86 | 111.37 | 78.82 | 29.31 | 395.34 | 37.01 | 35.55 | 208.49 | 50.46 | 264.93 | 329.15 | 226.96 | 19.27 | 125.78 | 177.64 | 66.78 | 178.20 | 175.58 | 31.83 | 137.37 | 67.02348 | 18.19 | 176.61 | 42.20 | 21.09 | 408.33 | 137.02 | 18.71 | 67.07 | 62.61 | 46.19 | 58.97 | 100.33 | 95.29 | 209.1495 | 55.57 | 349.83 | 458.99 | 55.44 | 199.44 | 452.43 | 298.66 | 35.61 | 52.78 | 91.34 | 370.97 | 155.76 | 152.53 | 50.36 | 267.25 | 73.80 | 381.96 | 290.79 | 67.24 | 77.54 | 71.34 | 129.47 | 37.64 | 112.20 | 84.61 | 345.13 | 170.39 | 106.0117 | 50.575 | 45.98 | 47.20 | 115.76 | 110.38 | 27.03 | 87.64 | 480.54 | 227.12 | 260.61 | 148.55 | 1456.41 | 56.36 | 13.78 | 62.61 | 241.24 | 84.07 | 52.22 | 315.17 | 27.88 | 124.85 | 495.41 | 378.56 | 32.64 | 254.23 | 63.80 | 94.21 | 141.07 | 15.628 | 4763.17 | 19.07 | 160.97 | 64.70 | 67.69 | 85.70 | 64.65 | 86.42 | 828.71 | 64.08 | 135.08 | 19.55 | 117.41 | 66.52666 | 15.96 | 63.49 | 179.33 | 48.39 | 86.87 | 152.04 | 132.01 | 309.52 | 48.13 | 132.48 | 116.06 | 102.30 | 164.28 | 47.49 | 75.55 | 128.87 | 30.17 | 98.26 | 281.21 | 102.01 | 125.96 | 138.64 | 77.08 | 114.61 | 95.21 | 57.29 | 62.53 | 680.49 | 22.09 | 110.67 | 114.38 | 209.69 | 269.34 | 36.53 | 443.41 | 118.73 | 127.73 | 133.89 | 291.39 | 104.74 | 85.43 | 236.01 | 160.33 | 55.64 | 240.40 | 325.95 | 71.50 | 118.32 | 350.34 | 78.565 | 101.38 | 80.91 | 55.34 | 215.51 | 82.16 | 96.48 | 32.29 | 261.00 | 79.20 | 19.04 | 50.21 | 650.82 | 121.41 | 93.21 | 257.15 | 156.35 | 81.51 | 546.09 | 148.77 | 41.25 | 74.92 | 52.24 | 114.85 | 189.12 | 217.48 | 119.77 | 65.71 | 178.08 | 176.68 | 70.51 | 43.35 | 38.47 | 148.10 | 484.76 | 490.06 | 211.46 | 181.69 | 376.00 | 46.61 | 218.60 | 128.58 | 178.02 | 182.05 | 208.25 | 283.23 | 47.81 | 41.37 | 100.91 | 320.40 | 36.35 | 37.59 | 96.56 | 69.27 | 42.39 | 158.24 | 32.63 | 48.31667 | 48.65333 | 255.06 | 31.42 | 94.33 | 71.98 | 108.47 | 111.79 | 129.76 | 126.05 | 147.06 |
| 2023-01-10 00:00:00+00:00 | 155.23 | 15.19 | 130.73 | 159.64 | 113.85 | 63.07 | 274.93 | 338.70 | 169.22 | 86.23 | 240.78 | 193.43 | 89.12 | 97.28 | 27.23 | 72.00 | 63.45 | 126.52 | 192.68 | 87.39 | 223.90 | 141.26 | 112.04 | 108.04 | 142.81 | 273.88 | 319.77 | 221.95 | 89.87 | 115.23 | 248.40 | 59.86 | 43.24 | 312.67 | 39.650 | 99.64 | 147.66 | 114.30 | 162.99 | 57.493 | 189.73 | 160.77 | 152.00 | 2435.52 | 206.69 | 34.12 | 56.09 | 45.74 | 45.80 | 84.27 | 255.37 | 29.15 | 97.51 | 281.46 | 432.92 | 48.01 | 2220.95 | 757.28 | 71.65 | 139.03 | 59.87 | 46.06 | 37.91373 | 80.53 | 67.39 | 48.19 | 40.44 | 78.13 | 250.51 | 224.77 | 122.49 | 82.73 | 147.10 | 9.68 | 164.32 | 186.38 | 118.53 | 83.00 | 41.26 | 81.29 | 92.80 | 374.64 | 303.79 | 110.51 | 78.31 | 140.98 | 37.88 | 177.50 | 29.0286 | 247.72 | 64.89 | 76.12 | 30.56 | 98.50 | 85.9625 | 118.08 | 167.26 | 481.40 | 56.00 | 30.810 | 113.39 | 147.44 | 48.81 | 77.83 | 32.40 | 111.2425 | 25.08 | 61.21 | 90.65 | 176.04 | 61.83 | 38.09 | 73.92 | 437.43 | 105.61 | 241.76 | 152.54 | 93.58 | 236.6046 | 95.56 | 102.32 | 145.88 | 139.30 | 339.73 | 150.31 | 120.02 | 105.43 | 79.78 | 63.13 | 110.88 | 124.02 | 45.09 | 152.20 | 98.09 | 205.00 | 358.95 | 68.20 | 262.45 | 470.68 | 89.63 | 97.82 | 127.61 | 695.15 | 58.75 | 34.04 | 85.69 | 214.06 | 160.41 | 107.73 | 77.02 | 43.69 | 108.54 | 94.64 | 146.48 | 12.84 | 48.50 | 44.71 | 409.28 | 188.74 | 42.59 | 143.94 | 101.04 | 69.00 | 34.46 | 128.70 | 101.83 | 48.42 | 66.12 | 247.54 | 60.07183 | 21.60 | 87.14 | 84.05 | 121.53 | 35.84 | 37.11 | 88.42 | 169.75 | 105.31 | 98.92 | 357.36 | 563.92 | 40.94 | 64.48 | 14.47 | 250.40 | 320.63 | 143.62 | 76.20 | 129.46 | 80.57 | 211.34 | 17.06 | 29.09 | 45.91 | 79.97 | 16.45 | 224.78 | 234.75 | 481.59 | 144.80 | 106.65 | 469.11 | 234.80 | 115.99 | 79.55 | 29.44 | 388.68 | 37.62 | 35.69 | 215.51 | 50.16 | 271.36 | 331.99 | 227.48 | 19.52 | 127.50 | 175.38 | 67.31 | 178.37 | 175.16 | 31.83 | 138.60 | 66.43192 | 18.21 | 178.01 | 42.33 | 20.38 | 416.27 | 135.86 | 18.74 | 67.39 | 62.13 | 45.98 | 59.04 | 99.98 | 96.57 | 210.7302 | 56.17 | 352.75 | 462.29 | 54.97 | 201.40 | 458.61 | 309.48 | 36.21 | 53.80 | 91.68 | 371.01 | 154.72 | 153.16 | 50.81 | 268.91 | 75.45 | 382.67 | 297.25 | 67.31 | 78.89 | 71.57 | 132.99 | 38.85 | 115.15 | 84.37 | 343.96 | 170.39 | 107.1321 | 50.265 | 45.99 | 44.83 | 116.50 | 110.81 | 27.01 | 88.92 | 488.08 | 228.85 | 262.35 | 148.00 | 1520.86 | 57.21 | 14.14 | 62.54 | 241.45 | 84.54 | 53.15 | 327.54 | 27.56 | 125.84 | 495.79 | 380.84 | 32.86 | 254.65 | 64.21 | 94.84 | 151.16 | 15.909 | 4828.53 | 19.39 | 166.07 | 64.14 | 69.31 | 85.80 | 64.27 | 86.50 | 826.88 | 64.15 | 134.19 | 19.89 | 117.31 | 66.99333 | 15.94 | 63.67 | 177.85 | 47.62 | 87.62 | 151.89 | 133.02 | 310.50 | 48.43 | 132.73 | 116.05 | 102.19 | 164.94 | 47.98 | 75.65 | 129.78 | 30.17 | 99.35 | 281.18 | 101.57 | 127.40 | 139.51 | 77.92 | 116.91 | 96.18 | 58.93 | 62.03 | 698.95 | 22.03 | 110.97 | 114.08 | 212.68 | 271.66 | 36.50 | 443.46 | 119.79 | 125.17 | 133.40 | 293.83 | 106.00 | 85.62 | 234.10 | 159.02 | 55.96 | 244.42 | 327.63 | 71.11 | 116.85 | 354.39 | 78.755 | 106.86 | 81.46 | 56.46 | 220.00 | 84.03 | 97.51 | 32.84 | 263.81 | 79.96 | 19.47 | 50.86 | 658.80 | 122.70 | 94.78 | 259.48 | 155.97 | 82.44 | 568.03 | 149.56 | 41.94 | 74.39 | 52.50 | 117.20 | 191.22 | 221.74 | 118.85 | 65.58 | 178.93 | 178.40 | 70.84 | 45.75 | 38.59 | 147.50 | 488.62 | 486.00 | 212.28 | 178.77 | 381.47 | 46.66 | 221.09 | 132.01 | 176.75 | 183.54 | 208.00 | 288.75 | 47.79 | 41.95 | 101.24 | 335.93 | 36.70 | 38.17 | 96.72 | 68.99 | 42.36 | 154.91 | 32.14 | 48.28667 | 48.90000 | 253.38 | 30.86 | 97.47 | 72.11 | 110.09 | 112.94 | 129.40 | 125.50 | 154.36 |
Structure of the price_data
dataset:
str(price_data[,1:10 ])
## 'data.frame': 418 obs. of 10 variables:
## $ Date: chr "2023-01-03 00:00:00+00:00" "2023-01-04 00:00:00+00:00" "2023-01-05 00:00:00+00:00" "2023-01-06 00:00:00+00:00" ...
## $ A : num 150 152 152 148 147 ...
## $ AAL : num 12.7 13.6 14 14.2 14.6 ...
## $ AAPL: num 125 126 125 130 130 ...
## $ ABBV: num 162 164 163 167 162 ...
## $ ABT : num 110 111 111 112 112 ...
## $ ACGL: num 62.5 62.8 62.9 63.9 62.7 ...
## $ ACN : num 270 269 263 269 274 ...
## $ ADBE: num 337 341 328 333 342 ...
## $ ADI : num 162 166 160 166 167 ...
esg_data with price_data datasetTo explore the relationship between ESG scores and stock performance,
we need to merge the two original datasets: one containing ESG metrics
(esg_data) and the other containing stock price time series
data (price_data). Before performing the merge, the stock
price data required reshaping and summarization to align with the
structure of the esg_data.
Reshaping the Stock Price Timeseries Data The stock
price data is stored in a wide format, with each column representing the
stock prices of different companies over time. In order to perform
calculations and merge the data with ESG scores, the stock price data is
reshaped into a long format using the pivot_longer()
function from the tidyr package.
price_data_long <- price_data %>%
tidyr::pivot_longer(cols = -Date, names_to = "Symbol", values_to = "Stock_Price")
Grouping by Symbol to Calculate Stock Performance
Metrics After the stock price_data is reshaped,
key financial performance metrics can be calculated for each company.
This step is important as it enables us to summarize the time series
data into a single row for each company. Therefore, this step makes the
now reshaped price_data_long compatible with the ESG
dataset for merging. The group_by() function is used to
calculate these metrics per company using the Symbol for
each S&P 500 companies in the dataset.
# Grouping by 'Symbol' to calculate metrics
price_summary <- price_data_long %>%
group_by(Symbol) %>%
summarise(
avg_price = mean(Stock_Price, na.rm = TRUE),
total_return = (last(Stock_Price) - first(Stock_Price)) / first(Stock_Price) * 100,
avg_log_return = mean(log(Stock_Price / lag(Stock_Price)), na.rm = TRUE),
volatility = sd(Stock_Price, na.rm = TRUE),
max_drawdown = (min(Stock_Price, na.rm = TRUE) - max(Stock_Price, na.rm = TRUE)) / max(Stock_Price, na.rm = TRUE) * 100
)
Financial metrics calculated from the price_data
dataset
- avg_price: The average stock price over
the entire time series, providing a basic measure of central tendency.
- total_return: The total return on the stock,
calculated as the percentage change from the first to the last stock
price in the time series. This metric is critical for understanding the
overall performance of the stock over time.
-
avg_log_return: The average log return of the stock,
calculated as the mean of the logarithmic daily returns. Log returns are
often used in financial analyses because they are additive over time,
making them useful for cumulative return calculations.
-
volatility: The standard deviation of the stock prices over
time, which measures the stock’s variability or risk. High volatility
typically indicates more uncertainty in stock performance.
-
max_drawdown: The maximum drawdown represents the worst
peak-to-trough decline in stock price during the time period, expressed
as a percentage. This is an important risk metric, as it shows the
potential loss an investor could have experienced by holding the stock.
Inspecting the price_summary
dataset
my_kable(head(price_summary))
| Symbol | avg_price | total_return | avg_log_return | volatility | max_drawdown |
|---|---|---|---|---|---|
| A | 131.89409 | -4.745398 | -0.0001166 | 12.759917 | -36.39507 |
| AAL | 13.90959 | -16.640502 | -0.0004365 | 2.036392 | -50.74468 |
| AAPL | 181.32703 | 83.097466 | 0.0014505 | 21.542943 | -46.75922 |
| ABBV | 157.90842 | 20.895425 | 0.0004550 | 15.009143 | -32.92331 |
| ABT | 106.36593 | 3.367398 | 0.0000794 | 6.049293 | -25.43816 |
| ACGL | 82.05794 | 81.059875 | 0.0014236 | 11.895713 | -45.83075 |
Merging the esg_data and
price_summary datasets:
After transforming
the price_data timeseries into the
price_summary dataset, we can merge the summary with the
ESG data to create a combined dataset which is suitable for analyzing
the relationship between ESG data and stock performance. The
egs_stock_combined dataset is exported into a CSV file with
row names set to the Symbol column.
esg_stock_combined <- merge(esg_data, price_summary, by = "Symbol")
row.names(esg_stock_combined) <- esg_stock_combined$Symbol
esg_stock_combined <- esg_stock_combined %>%
select(-Symbol)
write.csv(esg_stock_combined, "/Users/todomonkos/Documents/University/Year 4/Data Wrangling/Assignments/Final Project/esg_stock_combined.csv", row.names = TRUE)
Inspecting the combined dataset:
my_kable(head(esg_stock_combined))
| Full.Name | GICS.Sector | GICS.Sub.Industry | environmentScore | socialScore | governanceScore | totalEsg | highestControversy | percentile | ratingYear | ratingMonth | marketCap | beta | overallRisk | avg_price | total_return | avg_log_return | volatility | max_drawdown | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | Agilent Technologies | Health Care | Life Sciences Tools & Services | 1.12 | 6.42 | 6.10 | 13.64 | 2 | 7.98 | 2023 | 9 | 3.975183e+10 | 1.054 | 8 | 131.89409 | -4.745398 | -0.0001166 | 12.759917 | -36.39507 |
| AAL | American Airlines Group | Industrials | Passenger Airlines | 9.94 | 11.65 | 4.76 | 26.35 | 2 | 54.48 | 2023 | 9 | 7.335384e+09 | 1.433 | 10 | 13.90959 | -16.640502 | -0.0004365 | 2.036392 | -50.74468 |
| AAPL | Apple Inc. | Information Technology | Technology Hardware, Storage & Peripherals | 0.46 | 7.39 | 9.37 | 17.22 | 3 | 17.82 | 2023 | 9 | 3.296097e+12 | 1.240 | 1 | 181.32703 | 83.097466 | 0.0014505 | 21.542943 | -46.75922 |
| ABBV | AbbVie | Health Care | Biotechnology | 2.38 | 17.19 | 10.36 | 29.93 | 3 | 68.62 | 2023 | 9 | 3.416984e+11 | 0.619 | 6 | 157.90842 | 20.895425 | 0.0004550 | 15.009143 | -32.92331 |
| ABT | Abbott Laboratories | Health Care | Health Care Equipment | 2.27 | 14.24 | 8.33 | 24.83 | 3 | 48.27 | 2023 | 9 | 2.055344e+11 | 0.722 | 7 | 106.36593 | 3.367398 | 0.0000794 | 6.049293 | -25.43816 |
| ACGL | Arch Capital Group | Financials | Property & Casualty Insurance | 1.47 | 10.10 | 10.89 | 22.46 | 2 | 37.87 | 2023 | 9 | 4.285557e+10 | 0.596 | 5 | 82.05794 | 81.059875 | 0.0014236 | 11.895713 | -45.83075 |
Inspecting the structure of the combined dataset:
str(esg_stock_combined)
## 'data.frame': 426 obs. of 19 variables:
## $ Full.Name : chr "Agilent Technologies" "American Airlines Group" "Apple Inc." "AbbVie" ...
## $ GICS.Sector : chr "Health Care" "Industrials" "Information Technology" "Health Care" ...
## $ GICS.Sub.Industry : chr "Life Sciences Tools & Services" "Passenger Airlines" "Technology Hardware, Storage & Peripherals" "Biotechnology" ...
## $ environmentScore : num 1.12 9.94 0.46 2.38 2.27 ...
## $ socialScore : num 6.42 11.65 7.39 17.19 14.24 ...
## $ governanceScore : num 6.1 4.76 9.37 10.36 8.33 ...
## $ totalEsg : num 13.6 26.4 17.2 29.9 24.8 ...
## $ highestControversy: Ord.factor w/ 11 levels "0"<"1"<"2"<"3"<..: 3 3 4 4 4 3 3 3 2 4 ...
## $ percentile : num 7.98 54.48 17.82 68.62 48.27 ...
## $ ratingYear : num 2023 2023 2023 2023 2023 ...
## $ ratingMonth : num 9 9 9 9 9 9 9 9 9 9 ...
## $ marketCap : num 3.98e+10 7.34e+09 3.30e+12 3.42e+11 2.06e+11 ...
## $ beta : num 1.054 1.433 1.24 0.619 0.722 ...
## $ overallRisk : Ord.factor w/ 10 levels "1"<"2"<"3"<"4"<..: 8 10 1 6 7 5 2 1 9 4 ...
## $ avg_price : num 131.9 13.9 181.3 157.9 106.4 ...
## $ total_return : num -4.75 -16.64 83.1 20.9 3.37 ...
## $ avg_log_return : num -1.17e-04 -4.36e-04 1.45e-03 4.55e-04 7.94e-05 ...
## $ volatility : num 12.76 2.04 21.54 15.01 6.05 ...
## $ max_drawdown : num -36.4 -50.7 -46.8 -32.9 -25.4 ...
From the output, we can discern that the dataset comprises 426 entries spanning across 19 distinct variables:
column_classes_combined <- lapply(esg_stock_combined, class)
classify_variable_combined <- function(class) {
if (class %in% c("factor", "character")) {
return("categorical")
} else if (class == "integer") {
return("discrete")
} else if (class %in% c("numeric", "double")) {
return("continuous")
} else {
return("other")
}
}
variable_types <- sapply(column_classes_combined, function(cls) classify_variable_combined(cls[[1]]))
# List of explanations for each variable
variable_explanations <- c(
"Full name of the company",
"Sector the company operates in",
"Sub-industry classification",
"Environmental score of the company",
"Social score of the company",
"Governance score of the company",
"Total ESG score",
"Highest controversy rating associated with the company",
"Percentile rank of ESG score",
"Year of ESG rating",
"Month of ESG rating",
"Market capitalization of the company",
"Beta coefficient of the company stock",
"Overall risk level of the company",
"Average stock price over the time period",
"Total stock return over the time period",
"Average log return of the stock",
"Volatility of the stock prices",
"Maximum drawdown experienced by the stock"
)
# Creating a data frame for the variable types and explanations
variable_summary <- data.frame(
Variable = names(variable_types),
Type = variable_types,
Description = variable_explanations,
stringsAsFactors = FALSE,
row.names = NULL
)
my_kable(variable_summary)
| Variable | Type | Description |
|---|---|---|
| Full.Name | categorical | Full name of the company |
| GICS.Sector | categorical | Sector the company operates in |
| GICS.Sub.Industry | categorical | Sub-industry classification |
| environmentScore | continuous | Environmental score of the company |
| socialScore | continuous | Social score of the company |
| governanceScore | continuous | Governance score of the company |
| totalEsg | continuous | Total ESG score |
| highestControversy | other | Highest controversy rating associated with the company |
| percentile | continuous | Percentile rank of ESG score |
| ratingYear | continuous | Year of ESG rating |
| ratingMonth | continuous | Month of ESG rating |
| marketCap | continuous | Market capitalization of the company |
| beta | continuous | Beta coefficient of the company stock |
| overallRisk | other | Overall risk level of the company |
| avg_price | continuous | Average stock price over the time period |
| total_return | continuous | Total stock return over the time period |
| avg_log_return | continuous | Average log return of the stock |
| volatility | continuous | Volatility of the stock prices |
| max_drawdown | continuous | Maximum drawdown experienced by the stock |
Handling missing values In order to ensure the
dataset is complete and does not have any missing values, we first check
for missing data using the is.na() function.
sum(is.na(esg_stock_combined))
## [1] 0
The result was 0, indicating that there are no missing values in the dataset. As such, no imputation or further handling of missing data is required for this analysis.
Checking for outliers
Outliers in the data can
skew analysis results, so it is important to check for and handle them
accordingly. A summary of key financial performance variables, namely
total_return, volatility, and
max_drawdown, was generated to get an overview of the
data.
summary(select(esg_stock_combined, total_return, volatility, max_drawdown))
## total_return volatility max_drawdown
## Min. :-75.13 Min. : 0.8719 Min. :-89.48
## 1st Qu.: 2.36 1st Qu.: 6.0424 1st Qu.:-45.19
## Median : 20.74 Median : 11.9230 Median :-37.27
## Mean : 30.79 Mean : 25.1004 Mean :-38.68
## 3rd Qu.: 47.42 3rd Qu.: 24.3180 3rd Qu.:-29.87
## Max. :733.88 Max. :1079.0913 Max. :-18.14
From the summary, we can observe that there are some extreme values,
particularly in total_return and volatility,
which may be considered outliers. To further investigate, we made
boxplots for each of the variables:
p1 <- ggplot(esg_stock_combined, aes(x = "", y = total_return)) + geom_boxplot() + ggtitle("Total Return") + theme(plot.title = element_text(hjust = 0.5, size = 10)) + theme(plot.margin = unit(c(1,1,1,1), "cm"))
p2 <- ggplot(esg_stock_combined, aes(x = "", y = volatility)) + geom_boxplot() + ggtitle("Volatility") + theme(plot.title = element_text(hjust = 0.5, size = 10)) + theme(plot.margin = unit(c(1,1,1,1), "cm"))
p3 <- ggplot(esg_stock_combined, aes(x = "", y = max_drawdown)) + geom_boxplot() + ggtitle("Max Drawdown") + theme(plot.title = element_text(hjust = 0.5, size = 10)) + theme(plot.margin = unit(c(1,1,1,1), "cm"))
gridExtra::grid.arrange(p1, p2, p3, nrow = 1)
These boxplots visually highlight the presence of outliers, particularly
in the
total_return and volatility variables,
where there are extreme values beyond the typical range.
We
focuse here specifically on total_return,
volatility, and max_drawdown because these
variables are fundamental indicators of financial performance and risk,
particularly for companies in the context of investment analysis. Other
variables, such as environmental, social, and governance scores (ESG),
provide insight into a company’s sustainability practices but they do
not necessarily reflect directly the financial outcomes. Thus, these
three variables were chosen because they directly influence investor
decision-making and are critical for evaluating financial returns and
associated risks:
- Extreme values in total_return,
such as very high gains or large losses, can distort overall trends and
affect model accuracy. Winsorization caps these extremes, ensuring that
the analysis remains focused on typical performance without being skewed
by rare outliers.
- volatility often contains extreme
spikes due to market shocks or speculative movements. These outliers can
mislead the analysis by overemphasizing rare events. Winsorization
reduces the impact of these extreme values.
Large
`max_drawdown are often the result of extraordinary events,
such as market crashes or company-specific crises. While important to
consider, these extreme cases can overly influence the overall analysis.
Winsorization helps limit the effect of such extremes.
Apply Winsorization to limit extreme values at 5% and
95%
To handle the identified outliers, Winsorization was
applied to the financial variables (total_return,
volatility, and max_drawdown). Winsorization
reduces the influence of extreme values by limiting them to the 5th and
95th percentiles. This approach ensures that extreme outliers are capped
without being entirely removed from the dataset.
data_cleaned <- esg_stock_combined %>%
mutate(total_return = winsor(total_return, trim = 0.05),
volatility = winsor(volatility, trim = 0.05),
max_drawdown = winsor(max_drawdown, trim = 0.05))
summary(select(data_cleaned, total_return, volatility, max_drawdown))
## total_return volatility max_drawdown
## Min. :-25.59 Min. : 2.566 Min. :-61.49
## 1st Qu.: 2.36 1st Qu.: 6.042 1st Qu.:-45.19
## Median : 20.74 Median :11.923 Median :-37.27
## Mean : 27.73 Mean :19.237 Mean :-38.43
## 3rd Qu.: 47.42 3rd Qu.:24.318 3rd Qu.:-29.87
## Max. :106.70 Max. :76.453 Max. :-22.02
From this output, we can observe that extreme values have been
reduced compared to the previous summary, improving the dataset’s
distribution.
Visualizing the data after the
Winsorization
Here, we create boxplots for the Winsorized
data to visually compare the distributions before and after
Winsorization:
p4 <- ggplot(data_cleaned, aes(x = "", y = total_return)) + geom_boxplot() + ggtitle("Total Return After Winsorization") + theme(plot.title = element_text(hjust = 0.5, size = 10)) + theme(plot.margin = unit(c(1,1,1,1), "cm"))
p5 <- ggplot(data_cleaned, aes(x = "", y = volatility)) + geom_boxplot() + ggtitle("Volatility After Winsorization") + theme(plot.title = element_text(hjust = 0.5, size = 10)) + theme(plot.margin = unit(c(1,1,1,1), "cm"))
p6 <- ggplot(data_cleaned, aes(x = "", y = max_drawdown)) + geom_boxplot() + ggtitle("Max Drawdown After Winsorization") + theme(plot.title = element_text(hjust = 0.5, size = 10)) + theme(plot.margin = unit(c(1,1,1,1), "cm"))
gridExtra::grid.arrange(p4, p5, p6, nrow = 1)
These plots show that Winsorization improved distribution of the
variables. The extreme values seen in the earlier boxplots have now been
capped, leading to a more normalized range of values for each financial
performance variable.
With the data cleaned, we have now a more refined and reliable version of the data. This cleaning process was essential in making sure that the dataset is suitable for further analysis, reducing the risk of skewed results due to anomalies or inaccuracies by dealing with outliers and inconsistencies.
With the data prepared, we can now move on to the Exploratory Data
Analysis (EDA) phase of the project. In this part, we focus on gaining
deeper understanding of the dataset’s structure and relationships to
uncover potential trends or patterns. Through visualizations and summary
statistics, we will explore the relationships between the key variables,
particularly the ESG scores and the target variable
total_return. This exploration helps us to spot any
interesting patterns in the data and set the foundation for the modeling
phase, where we dig deeper into how ESG scores might impact financial
performance.
total_returnTo better understand the distribution of the
total_return variable, a histogram was created with a
normal distribution curve:
ggplot(data_cleaned, aes(x = total_return)) +
geom_histogram(aes(y = ..density..), bins = 30, color = "black", fill = "lightblue") +
stat_function(fun = dnorm, args = list(mean = mean(data_cleaned$total_return),
sd = sd(data_cleaned$total_return)),
color = "red", size = 1) +
labs(title = "Histogram of Total Return with Normal Curve",
x = "Total Return", y = "Density")
Interpretation
The histogram reveals that the
distribution of total_return is somewhat right-skewed, with
most values clustered around 20 to 40. This skewness indicates that
while the majority of companies have low or negative returns, there are
some companies which exhibit positive returns.
By overlaying a
normal curve (red line), we can compare the actual distribution of the
total_return to a normal distribution. The normal curve in
red does not perfectly align with the histogram, indicating that the
data does not follow a normal distribution. However, the curve shows
that many of the observations are concentrated near the mean value of
the distribution. This highlights that the data may require some
transformation for modeling purposes later on, as many statistical
models assume normality.
marketCap variableTo better understand the distribution of the marketCap
variable, a histogram was created with a normal distribution curve:
ggplot(data_cleaned, aes(x = marketCap)) +
geom_histogram(aes(y = ..density..), bins = 30, color = "black", fill = "lightblue") +
stat_function(fun = dnorm, args = list(mean = mean(data_cleaned$marketCap),
sd = sd(data_cleaned$marketCap)),
color = "red", size = 1) +
labs(title = "Histogram of Market Capitalization",
x = "Market Capitalization", y = "Density")
Now that we have covered the target variable, we shift our focus to
the ESG variables: environmentScore,
socialScore, governanceScore. These components
of the analysis are important for the analysis, as they represent a
company’s commitment to sustainable practices across different
areas.
To begin with, we will look at the individual distributions of each ESG component. This helps us to understand the range of scores across companies and whether any component has more variation than the others.
esg_plot_1 <- ggplot(data_cleaned, aes(x = environmentScore)) +
geom_histogram(binwidth = 1, fill = "#021E36", color = "white") +
labs(title = "Distribution of Environment Score", x = "Environment Score", y = "Count") +
theme_minimal(base_family = "Josefin Sans")
esg_plot_2 <- ggplot(data_cleaned, aes(x = socialScore)) +
geom_histogram(binwidth = 1, fill = "#021E36", color = "white") +
labs(title = "Distribution of Social Score", x = "Social Score", y = "Count") +
theme_minimal(base_family = "Josefin Sans")
esg_plot_3 <- ggplot(data_cleaned, aes(x = governanceScore)) +
geom_histogram(binwidth = 1, fill = "#021E36", color = "white") +
labs(title = "Distribution of Governance Score", x = "Governance Score", y = "Count") +
theme_minimal(base_family = "Josefin Sans")
gridExtra::grid.arrange(esg_plot_1, esg_plot_2, esg_plot_3, nrow = 1)
Interpretation: - Environmental Score: Skewed to the
right, with most companies scoring low, suggesting that high
environmental performance is less common. - Social Score: More evenly
distributed, resembling a normal distribution, indicating consistent
performance across companies in social responsibility. - Governance
Score: Also right-skewed, with most companies scoring in the mid-range,
showing some variation but fewer companies with extremely high
governance practices.
After seeing the distribution of different ESG variables, we move on
to see how the ESG scores differ across GICS sectors. The
GICS.Sector variable in the dataset categorizes companies
based on their industry sectors, providing valuable context for
understanding how ESG scores might vary across different types of
businesses. Different sectors may prioritize environmental, social, or
governance factors differently depending on the nature of their
operations. For example, companies in sectors like Energy or Utilities
may face greater environmental scrutiny, while sectors such as
Financials might place more emphasis on governance practices.
By analyzing the distribution of ESG scores across different GICS sectors, we can gain insights into how sector-specific challenges and priorities influence a company’s performance in these areas. This exploration will help us to determine whether certain sectors are more prone to excelling in specific ESG components.
sector_summary <- data_cleaned %>%
group_by(GICS.Sector) %>%
summarize(Count = n())
ggplot(sector_summary, aes(x = reorder(GICS.Sector, -Count), y = Count, fill = GICS.Sector)) +
geom_bar(stat = "identity") +
labs(title = "Number of Companies by GICS Sector",
x = "GICS Sector",
y = "Number of Variables") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
On the plot each bar represents a different sector, giving us a clear
overview of how the dataset is distributed across industries. It seems
that
Financials has the most representation in the dataset,
with over 60 companies.
Industrials and Information Technology are also highly
represented, with close to 60 companies each. This suggests that these
sectors might have a larger influence on the overall trends in the data,
particularly in terms of ESG scores and financial performance.
ggplot(data_cleaned, aes(x = GICS.Sector, y = environmentScore, fill = GICS.Sector)) +
geom_boxplot() +
labs(title = "Distribution of Environment Score by GICS Sector", x = "GICS Sector", y = "Environment Score") +
theme_minimal(base_family = "Josefin Sans") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
The plot highlights how environmental performance differs greatly by
sector, with Energy, Materials, and Consumer Staples sectors leading in
environmental scores, while Financials, Health Care, and Communication
Services tend to score lower. This variation likely reflects the
different levels of environmental risk and regulatory scrutiny that each
sector faces. Companies in sectors with higher environmental impacts,
like Energy, tend to be under more pressure to improve their
sustainability practices, thus, leading to higher scores. However, it is
important to highlight that the high score does not mean that their
greenhouse gas emission levels are low, rather it is usually on the
contrary.
ggplot(data_cleaned, aes(x = GICS.Sector, y = socialScore, fill = GICS.Sector)) +
geom_boxplot() +
labs(title = "Distribution of Social Score by GICS Sector", x = "GICS Sector", y = "Social Score") +
theme_minimal(base_family = "Josefin Sans") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
This plot shows how the distribution of social scores varies by sector,
with Health Care and Industrials leading in social performance, while
sectors like Energy and Real Estate tend to have lower scores. This
likely reflects the different social challenges and priorities faced by
companies across industries, where sectors with more direct societal
impacts, such as Health Care, tend to perform better in social
responsibility metrics.
ggplot(data_cleaned, aes(x = GICS.Sector, y = governanceScore, fill = GICS.Sector)) +
geom_boxplot() +
labs(title = "Distribution of Governance Score by GICS Sector", x = "GICS Sector", y = "Governance Score") +
theme_minimal(base_family = "Josefin Sans") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
This plot shows that sectors such as Financials and Health Care lead in
governance scores, likely due to the high level of scrutiny they face
from regulators and stakeholders. In contrast, sectors like Utilities,
Real Estate, and Materials show lower governance performance, reflecting
less emphasis on governance practices in these industries. This
distribution highlights the differing priorities and regulatory
environments across industries when it comes to corporate
governance.
ggplot(data_cleaned, aes(x = GICS.Sector, y = totalEsg, fill = GICS.Sector)) +
geom_boxplot() +
labs(title = "Distribution of Total ESG Score by GICS Sector", x = "GICS Sector", y = "Total ESG Score") +
theme_minimal(base_family = "Josefin Sans") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
Finally, this plot highlights that the Energy and Materials sectors lead
in total ESG scores, reflecting the intense focus on sustainability
within these industries, while sectors like Financials, Real Estate, and
Health Care show lower overall ESG performance. This variation
highlights the differing levels of emphasis on ESG practices across
industries, with certain sectors facing more pressure to perform in
these areas.
cor_matrix <- cor(data_cleaned[, c("environmentScore", "socialScore", "governanceScore", "totalEsg", "total_return", "marketCap", "volatility")], use = "complete.obs")
ggcorrplot(cor_matrix, method = "circle", lab = TRUE)
The correlation matrix highlights that while the individual ESG scores
are highly correlated with the totalEsg score which is self-explanatory,
their relationships with financial metrics like total_return, marketCap,
and volatility are quite weak and sometimes negative. This suggests that
ESG performance, at least in a linear context, does not have a strong
direct impact on financial outcomes like returns or market
capitalization. Further analysis might provide deeper insights into how
ESG scores could influence financial performance in specific GICS
sectors or under certain conditions.
Thus, in the scatter plot down below, we will try to display the relationship between total ESG score and total return, with each GICS sector represented by different colors and corresponding linear regression lines.
ggplot(data_cleaned, aes(x = totalEsg, y = total_return, color = GICS.Sector)) +
geom_point(alpha = 0.7) +
geom_smooth(method = "lm", se = FALSE) +
labs(title = "Relationship between Total ESG Score and Total Return by GICS Sector",
x = "Total ESG Score", y = "Total Return")
This plot show a quite varied relationship across the sector. Some
sectors show a positive relationship, while others show a negative or
flat relationship. In cases with positive relationship, such as Consumer
Discretionary, Consumer Staples, and Utilities exhibit positive trends,
suggesting that in these sectors, companies with higher ESG scores tend
to have higher total returns. On the other hand, Information Technology,
Energy, and Financials show a negative or flat trend, where an increase
in ESG scores does not appear to lead to higher total returns, and may
even correspond with lower returns in some cases. This suggests that,
for these sectors, ESG initiatives might not be directly driving
financial performance, or that other factors could be more
influential.
However, there is a high degree of scatter within each sector, indicating that even within sectors, the relationship between total ESG score and total return is not perfectly linear. The dispersion of points suggests that other factors (e.g., company size, market conditions) might also play a role in determining total returns. Therefore, it is crucial to further analyse the relationship of ESG scores and financial metrics to see their intricate play, helping to clarify whether these sector-specific trends hold statistically and what other factors might influence the relationship between ESG scores and returns.
In this section we partition the esg_stock_combined
dataset randomly into two groups: train_set (80%) and
test_set (20%). We partition the dataset in order to
prepare it for modelling in the subsequent parts of the project.
For partitioning the dataset, we use the partition()
function from the liver package. Note that we are also
using the set.seed() function for the sake of creating
reproducible results.
set.seed(123)
data_sets = partition(data = data_cleaned, prob = c(0.8, 0.2))
train_set = data_sets$part1
test_set = data_sets$part2
In order to validate the partitioning of the dataset
esg_stock_combined, we conduct a Welch Two-Sample t-test to
statistically compare the means of total_return between the
training and test sets. However, it is important to remember that the
t-test assumes that the distribution of the data is normal.
\[ H_0: \mu_{\text{train}} = \mu_{\text{test}} \]
\[ H_1: \mu_{\text{train}} \neq \mu_{\text{test}} \]
t.test(train_set$total_return, test_set$total_return)
##
## Welch Two Sample t-test
##
## data: train_set$total_return and test_set$total_return
## t = -0.62572, df = 117.04, p-value = 0.5327
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -11.744907 6.105155
## sample estimates:
## mean of x mean of y
## 27.20419 30.02406
Statistical interpretation: The result of the t-test
showed a p-value of 0.5327, which is well above the typical
significance level of 0.05. This means that we fail to reject the null
hypothesis, which means that there is no statistically significant
difference between the means of total_return in the
training and test sets.
Non-statistical interpretation:
Thus, the
t-test provides enough evidence that the partitioning has not introduced
any bias in terms of the mean total_return between the
training and test sets. This ensures that the split is fair, and both
sets are likely representative of the overall data.
We use a density plot in order to confirm the findings of the Welch Two-Sample t-test.
train_set$Set <- "Train"
test_set$Set <- "Test"
combined_data <- rbind(train_set, test_set)
ggplot(combined_data, aes(x = total_return, fill = Set)) +
geom_density(alpha = 0.4) +
labs(title = "Density Plot of Total Return (Train vs Test)", x = "Total Return", y = "Density")
Interpretation: The plot showed no significant
difference between the means of total_return in the
training and test sets. The density plot shows that the distributions of
total_return for both the training and test sets are very
similar, with a high degree of overlap between the two curves,
confirming that the partitioning of the data has maintained consistency
in the distribution of the target variable. This alignment between the
t-test result and the density plot gives us confidence that the
partitioning is balanced and suitable for model development.
During the earlier distribution analysis of total_return
and marketCap, we observed positive skewness and potential
heteroscedasticity in both variables. To stabilize the variance and
improve linear modeling assumptions, we apply a log transformation.
Since the total_return variable contains zero or near-zero
values, we use a log(x + 1) transformation to avoid undefined values.
Similarly, marketCap is also log-transformed to reduce
scale effects and skewness. These transformations aim to improve the
linear model’s fit and satisfy the assumptions of homoscedasticity and
normality of residuals.
epsilon <- 0.01
train_set$log_total_return <- log(train_set$total_return + epsilon)
test_set$log_total_return <- log(test_set$total_return + epsilon)
train_set$log_marketCap <- log(train_set$marketCap + epsilon)
test_set$log_marketCap <- log(test_set$marketCap + epsilon)
train_set$log_percentile <- log(train_set$percentile + epsilon)
test_set$log_percentile <- log(test_set$percentile + epsilon)
Checking the distribution for normality of Logarithmic Return in the train and test set
library(patchwork)
plot1 <- ggplot(train_set, aes(x = log_total_return)) +
geom_histogram(aes(y = ..density..), bins = 30, color = "black", fill = "lightblue") +
labs(title = "Histogram of Logarithmic Return in Train Set",
x = "Log Return", y = "Density")
plot2 <- ggplot(test_set, aes(x = log_total_return)) +
geom_histogram(aes(y = ..density..), bins = 30, color = "black", fill = "lightblue") +
labs(title = "Histogram of Logarithmic Return in Test Set",
x = "Log Return", y = "Density")
plot1 + plot2
data_cleaned$GICS.Sector <- as.factor(data_cleaned$GICS.Sector)
Modelling the relationship between ESG scores and
total_return is complicated since the correlation matrix
have highlighted above that the individual ESG scores have a really weak
and sometimes negative relationship with financial metrics like
total_return, marketCap, and volatility. Therefore, in the following
modelling variables with weak correlation to total_return are also
utilized, such as environmentScore, to answer the question
of how useful are these ESG scores in driving sustainable growth and
investor confidence.
formula = log_total_return ~ environmentScore + socialScore + governanceScore + totalEsg + highestControversy + log_percentile + ratingYear + log_marketCap + GICS.Sector
First, we start with a simple linear model which helps to establish a
clear baseline for understanding the basic relationships between the
dependent variable total_return and key independent
variables defined in the formula above. This allows us to determine if
any variables have a significant impact on total return without the
complexity. Moreover, this model is easier to interpret, making them
ideal for building a foundation before moving on to more advanced
techniques.
model_simple_linear = lm(formula, data = train_set)
summary(model_simple_linear)
##
## Call:
## lm(formula = formula, data = train_set)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0285 -0.5004 0.0743 0.7115 2.5555
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 305.11060 619.44536 0.493 0.622774
## environmentScore -10.25245 10.07613 -1.017 0.309932
## socialScore -10.24876 10.07603 -1.017 0.310101
## governanceScore -10.21542 10.07212 -1.014 0.311487
## totalEsg 10.22204 10.07906 1.014 0.311507
## highestControversy.L 0.07213 0.56797 0.127 0.899041
## highestControversy.Q -0.13467 0.45250 -0.298 0.766256
## highestControversy.C -0.07766 0.38253 -0.203 0.839300
## highestControversy^4 -0.26862 0.29842 -0.900 0.368931
## highestControversy^5 -0.17648 0.18462 -0.956 0.340074
## log_percentile 0.30141 0.23052 1.308 0.192275
## ratingYear -0.15315 0.30635 -0.500 0.617581
## log_marketCap 0.25918 0.07140 3.630 0.000346 ***
## GICS.SectorConsumer Discretionary 1.38348 0.41615 3.325 0.001023 **
## GICS.SectorConsumer Staples 0.28676 0.45768 0.627 0.531549
## GICS.SectorEnergy 0.20327 0.58346 0.348 0.727846
## GICS.SectorFinancials 0.78324 0.40725 1.923 0.055623 .
## GICS.SectorHealth Care 0.67850 0.40593 1.671 0.095923 .
## GICS.SectorIndustrials 1.19007 0.39863 2.985 0.003122 **
## GICS.SectorInformation Technology 1.56966 0.39835 3.940 0.000106 ***
## GICS.SectorMaterials 0.80839 0.52452 1.541 0.124572
## GICS.SectorReal Estate 1.15655 0.47567 2.431 0.015768 *
## GICS.SectorUtilities 0.31836 0.50004 0.637 0.524943
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.016 on 242 degrees of freedom
## (81 observations deleted due to missingness)
## Multiple R-squared: 0.2269, Adjusted R-squared: 0.1566
## F-statistic: 3.228 on 22 and 242 DF, p-value: 4.278e-06
Interpretation of the simple linear regression The
results indicate that none of the individual ESG components show a
statistically significant relationship with total_return in
this model. This might suggests that ESG performance alone may not
directly drive financial outcomes for companies in this dataset.
However, market capitalization is positively and significantly
associated with total return, meaning that larger companies generally
achieve higher financial performance. Additionally, the percentile
ranking has a significant negative relationship with total return,
indicating that companies ranked higher in ESG performance tend to have
slightly lower returns. Overall, the Adjusted R-squared is 0.1582
meaning that the model explains only about 15.82% of the variation in
total_return, indicating that this simple model might need
some refinement but it provides a baseline.
Before refining the model, it is also important to check for
multicollinearity among the independent variables. Since the individual
ESG components (environmentScore, socialScore,
governanceScore) and the aggregate totalEsg
score are expected to be highly correlated, we calculate the Variance
Inflation Factor (VIF) to detect redundancy. A VIF value above 5 or 10
typically indicates problematic multicollinearity that can distort the
estimation of regression coefficients.
if(!require(car)) install.packages("car")
library(car)
vif_model <- lm(formula, data = train_set)
vif(vif_model)
## GVIF Df GVIF^(1/(2*Df))
## environmentScore 6.251144e+05 1 790.641791
## socialScore 3.419582e+05 1 584.771961
## governanceScore 1.344686e+05 1 366.699563
## totalEsg 1.194705e+06 1 1093.025491
## highestControversy 2.931120e+00 5 1.113534
## log_percentile 1.139971e+01 1 3.376346
## ratingYear 1.164337e+00 1 1.079044
## log_marketCap 1.579451e+00 1 1.256762
## GICS.Sector 2.014389e+01 10 1.162003
The results indicate extremely high multicollinearity, with VIF
values exceeding 800 for the individual ESG scores and over 1100 for
totalEsg. This confirms that these variables convey
overlapping information. As a result, totalEsg should be
excluded from further modeling to reduce redundancy and improve model
stability. Individual ESG components are preserved for interpretability
and allow for more detailed analysis of their effects on financial
performance.
As a result, we update the formula:
formula_refined = log_total_return ~ environmentScore + socialScore + governanceScore + log_percentile + log_marketCap + GICS.Sector
Check to see if multicollinearity is improved:
vif_model_updated = lm(formula_refined, data = train_set)
vif(vif_model_updated)
## GVIF Df GVIF^(1/(2*Df))
## environmentScore 6.612204 1 2.571421
## socialScore 4.544369 1 2.131752
## governanceScore 3.516014 1 1.875104
## log_percentile 9.777910 1 3.126965
## log_marketCap 1.233872 1 1.110798
## GICS.Sector 12.699610 10 1.135506
model_refined_test = lm(formula_refined, data = train_set)
plot(model_refined_test)
Interpretation
- Linearity:
The
Residuals vs. Fitted plot shows that the residuals are scattered fairly
symmetrically around the horizontal axis, with no strong curvature or
systematic pattern. This suggests that the relationship between the
predictors and the log-transformed total return is approximately linear,
satisfying the linearity assumption of the model.
-
Independence:
There is no discernible structure or
clustering in the residuals over the range of fitted values, which would
indicate autocorrelation or dependency. This pattern supports the
assumption that the residuals are independent and not influenced by
serial or grouped effects.
- Normality:
The Q-Q plot
shows that most standardized residuals align closely with the
theoretical quantiles, though a few observations in both tails deviate
from the line. This indicates a roughly normal distribution of
residuals, with mild tail deviations that are generally acceptable given
the sample size and context.
- Constant Variance:
The
Scale-Location plot reveals some variation in the spread of standardized
residuals, with slightly higher variance at lower fitted values and a
slight downward trend. This suggests the presence of mild
heteroscedasticity, which may not invalidate the model but could affect
the efficiency of coefficient estimates and standard errors.
Based on the diagnostic plots above, there are clear signs that suggest we should explore possible improvements to our model fit and address non-linearity and potential heteroscedasticity. Nonlinear regression could potentially solve these issues because it allows more flexibility in capturing relationships between the independent variables and the dependent variable.
formula_refined_polynomial <- log_total_return ~
poly(environmentScore, 2) +
poly(socialScore, 2) +
poly(governanceScore, 2) +
log_percentile +
log_marketCap +
GICS.Sector
polynomial_model <- lm(formula = formula_refined_polynomial, data = train_set)
summary(polynomial_model)
##
## Call:
## lm(formula = formula_refined_polynomial, data = train_set)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1215 -0.4843 0.0677 0.6748 2.3946
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.96951 1.95665 -3.051 0.002532 **
## poly(environmentScore, 2)1 -7.39015 3.68848 -2.004 0.046212 *
## poly(environmentScore, 2)2 1.57030 2.04548 0.768 0.443406
## poly(socialScore, 2)1 -6.63654 3.27581 -2.026 0.043852 *
## poly(socialScore, 2)2 4.68576 1.78630 2.623 0.009256 **
## poly(governanceScore, 2)1 -1.88938 2.50104 -0.755 0.450708
## poly(governanceScore, 2)2 0.34517 1.18190 0.292 0.770499
## log_percentile 0.71622 0.30891 2.318 0.021243 *
## log_marketCap 0.24387 0.06216 3.924 0.000113 ***
## GICS.SectorConsumer Discretionary 1.38471 0.41469 3.339 0.000971 ***
## GICS.SectorConsumer Staples 0.25207 0.45408 0.555 0.579313
## GICS.SectorEnergy 0.17551 0.60462 0.290 0.771846
## GICS.SectorFinancials 0.73695 0.39659 1.858 0.064332 .
## GICS.SectorHealth Care 0.56517 0.39708 1.423 0.155915
## GICS.SectorIndustrials 1.16504 0.39763 2.930 0.003709 **
## GICS.SectorInformation Technology 1.48207 0.38777 3.822 0.000168 ***
## GICS.SectorMaterials 0.47929 0.53198 0.901 0.368493
## GICS.SectorReal Estate 0.79679 0.48473 1.644 0.101496
## GICS.SectorUtilities 0.36331 0.49297 0.737 0.461836
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9999 on 246 degrees of freedom
## (81 observations deleted due to missingness)
## Multiple R-squared: 0.2394, Adjusted R-squared: 0.1837
## F-statistic: 4.301 on 18 and 246 DF, p-value: 5.62e-08
Interpretation In this model, we edited the formula of the stepwise regression in order to address the above mentioned problems with non-linearity and potential heteroscedasticity. Since the ESG scores are the core of this analysis and understanding their relationship with financial return are the main goal, we changed only the ESG variables to polynomial terms. This allows us to better capture potential non-linear relationships between the environmental, social, and governance scores and financial performance, without overly complicating the model by introducing non-linearities in other variables like marketCap or percentile.
This refined polynomial model offers an improvement over the simpler
linear model, by capturing non-linear relationships between the ESG
scores and total_return. However, the overall fit remains
relatively low with an Adjusted R-squared score of 0.1846, indicating
that other factors not included in the model may play a significant role
in explaining financial performance.
As we saw above in the exploratory data analysis section that the relationship between ESG scores and total_return was varying across different GICS sectors. Therefore, here we include interaction terms between the ESG scores and the GICS sector. This will allow us to investigate whether the impact of ESG scores on total return varies by sector.
refined_model_with_gics <- lm(
log_total_return ~
environmentScore * GICS.Sector +
socialScore * GICS.Sector +
governanceScore * GICS.Sector +
log_percentile +
log_marketCap,
data = train_set
)
summary(refined_model_with_gics)
##
## Call:
## lm(formula = log_total_return ~ environmentScore * GICS.Sector +
## socialScore * GICS.Sector + governanceScore * GICS.Sector +
## log_percentile + log_marketCap, data = train_set)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9758 -0.4174 0.0601 0.5248 1.8099
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) -6.455455 2.473324 -2.610
## environmentScore -0.411846 0.198587 -2.074
## GICS.SectorConsumer Discretionary 2.682741 2.193905 1.223
## GICS.SectorConsumer Staples -1.649883 2.703444 -0.610
## GICS.SectorEnergy 1.691985 3.291319 0.514
## GICS.SectorFinancials 3.003766 2.202192 1.364
## GICS.SectorHealth Care 0.827567 2.328877 0.355
## GICS.SectorIndustrials 1.553054 2.212556 0.702
## GICS.SectorInformation Technology 4.284056 2.261286 1.895
## GICS.SectorMaterials 2.808725 2.989329 0.940
## GICS.SectorReal Estate 3.777403 2.713089 1.392
## GICS.SectorUtilities -2.363735 3.867831 -0.611
## socialScore 0.030877 0.201116 0.154
## governanceScore 0.156568 0.177805 0.881
## log_percentile 0.454975 0.297546 1.529
## log_marketCap 0.273359 0.063931 4.276
## environmentScore:GICS.SectorConsumer Discretionary 0.436459 0.200836 2.173
## environmentScore:GICS.SectorConsumer Staples 0.474688 0.221090 2.147
## environmentScore:GICS.SectorEnergy 0.737726 0.236356 3.121
## environmentScore:GICS.SectorFinancials 0.178092 0.236929 0.752
## environmentScore:GICS.SectorHealth Care 0.480022 0.255244 1.881
## environmentScore:GICS.SectorIndustrials 0.363058 0.198349 1.830
## environmentScore:GICS.SectorInformation Technology 0.327840 0.197524 1.660
## environmentScore:GICS.SectorMaterials 0.409426 0.214075 1.913
## environmentScore:GICS.SectorReal Estate 0.444564 0.350949 1.267
## environmentScore:GICS.SectorUtilities 0.550294 0.212974 2.584
## GICS.SectorConsumer Discretionary:socialScore -0.212557 0.215594 -0.986
## GICS.SectorConsumer Staples:socialScore -0.068244 0.224985 -0.303
## GICS.SectorEnergy:socialScore 0.104489 0.308674 0.339
## GICS.SectorFinancials:socialScore -0.179849 0.206145 -0.872
## GICS.SectorHealth Care:socialScore -0.018995 0.204389 -0.093
## GICS.SectorIndustrials:socialScore -0.112786 0.202873 -0.556
## GICS.SectorInformation Technology:socialScore -0.142916 0.210245 -0.680
## GICS.SectorMaterials:socialScore -0.270132 0.262268 -1.030
## GICS.SectorReal Estate:socialScore -0.241422 0.335666 -0.719
## GICS.SectorUtilities:socialScore 0.060475 0.225508 0.268
## GICS.SectorConsumer Discretionary:governanceScore 0.004797 0.221475 0.022
## GICS.SectorConsumer Staples:governanceScore 0.222794 0.354735 0.628
## GICS.SectorEnergy:governanceScore -1.507085 0.395397 -3.812
## GICS.SectorFinancials:governanceScore -0.131085 0.186627 -0.702
## GICS.SectorHealth Care:governanceScore -0.129796 0.238626 -0.544
## GICS.SectorIndustrials:governanceScore 0.048321 0.212495 0.227
## GICS.SectorInformation Technology:governanceScore -0.311063 0.223480 -1.392
## GICS.SectorMaterials:governanceScore -0.199872 0.553884 -0.361
## GICS.SectorReal Estate:governanceScore -0.334216 0.322516 -1.036
## GICS.SectorUtilities:governanceScore -0.092076 0.566157 -0.163
## Pr(>|t|)
## (Intercept) 0.009678 **
## environmentScore 0.039259 *
## GICS.SectorConsumer Discretionary 0.222714
## GICS.SectorConsumer Staples 0.542303
## GICS.SectorEnergy 0.607718
## GICS.SectorFinancials 0.173971
## GICS.SectorHealth Care 0.722669
## GICS.SectorIndustrials 0.483470
## GICS.SectorInformation Technology 0.059473 .
## GICS.SectorMaterials 0.348466
## GICS.SectorReal Estate 0.165247
## GICS.SectorUtilities 0.541749
## socialScore 0.878121
## governanceScore 0.379520
## log_percentile 0.127685
## log_marketCap 2.84e-05 ***
## environmentScore:GICS.SectorConsumer Discretionary 0.030838 *
## environmentScore:GICS.SectorConsumer Staples 0.032890 *
## environmentScore:GICS.SectorEnergy 0.002044 **
## environmentScore:GICS.SectorFinancials 0.453057
## environmentScore:GICS.SectorHealth Care 0.061348 .
## environmentScore:GICS.SectorIndustrials 0.068549 .
## environmentScore:GICS.SectorInformation Technology 0.098397 .
## environmentScore:GICS.SectorMaterials 0.057112 .
## environmentScore:GICS.SectorReal Estate 0.206590
## environmentScore:GICS.SectorUtilities 0.010420 *
## GICS.SectorConsumer Discretionary:socialScore 0.325264
## GICS.SectorConsumer Staples:socialScore 0.761928
## GICS.SectorEnergy:socialScore 0.735304
## GICS.SectorFinancials:socialScore 0.383923
## GICS.SectorHealth Care:socialScore 0.926039
## GICS.SectorIndustrials:socialScore 0.578818
## GICS.SectorInformation Technology:socialScore 0.497374
## GICS.SectorMaterials:socialScore 0.304152
## GICS.SectorReal Estate:socialScore 0.472765
## GICS.SectorUtilities:socialScore 0.788819
## GICS.SectorConsumer Discretionary:governanceScore 0.982741
## GICS.SectorConsumer Staples:governanceScore 0.530621
## GICS.SectorEnergy:governanceScore 0.000179 ***
## GICS.SectorFinancials:governanceScore 0.483180
## GICS.SectorHealth Care:governanceScore 0.587044
## GICS.SectorIndustrials:governanceScore 0.820326
## GICS.SectorInformation Technology:governanceScore 0.165363
## GICS.SectorMaterials:governanceScore 0.718555
## GICS.SectorReal Estate:governanceScore 0.301216
## GICS.SectorUtilities:governanceScore 0.870957
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9508 on 219 degrees of freedom
## (81 observations deleted due to missingness)
## Multiple R-squared: 0.3877, Adjusted R-squared: 0.2619
## F-statistic: 3.082 on 45 and 219 DF, p-value: 2.246e-08
if(!require(caret)) install.packages("caret")
if(!require(randomForest)) install.packages("randomForest")
if(!require(rpart)) install.packages("rpart")
library(caret)
library(randomForest)
library(rpart)
set.seed(123)
# Include relevant variables and convert GICS.Sector to factor
data_model <- train_set[, c("log_total_return", "environmentScore", "socialScore", "governanceScore",
"highestControversy", "log_percentile", "log_marketCap", "GICS.Sector")]
data_model <- na.omit(data_model)
ctrl <- trainControl(method = "cv", number = 10)
model_tree <- train(
log_total_return ~ .,
data = data_model,
method = "rpart",
trControl = ctrl
)
model_rf <- train(
log_total_return ~ .,
data = data_model,
method = "rf",
trControl = ctrl,
ntree = 500,
importance = TRUE
)
train_set <- na.omit(train_set)
model_simple_linear_cv = train(
formula,
data = train_set,
method = "lm",
trControl = ctrl,
)
model_gics_interaction_cv <- train(
log_total_return ~
environmentScore * GICS.Sector +
socialScore * GICS.Sector +
governanceScore * GICS.Sector +
log_percentile +
log_marketCap,
data = train_set,
method = "lm",
trControl = ctrl,
)
results <- resamples(list("Simple Linear Regression" = model_simple_linear_cv,
"Linear Model with Interaction Terms" = model_gics_interaction_cv,
"Decision Tree" = model_tree,
"Random Forest" = model_rf))
summary(results)
##
## Call:
## summary.resamples(object = results)
##
## Models: Simple Linear Regression, Linear Model with Interaction Terms, Decision Tree, Random Forest
## Number of resamples: 10
##
## MAE
## Min. 1st Qu. Median Mean
## Simple Linear Regression 0.6277472 0.7805803 0.8088434 0.8008553
## Linear Model with Interaction Terms 0.6443244 0.6959940 0.7514016 0.7901103
## Decision Tree 0.7391373 0.8407061 0.8728185 0.8557857
## Random Forest 0.5667923 0.6680921 0.8056157 0.7802555
## 3rd Qu. Max. NA's
## Simple Linear Regression 0.8372539 0.9474822 0
## Linear Model with Interaction Terms 0.9049162 0.9532036 0
## Decision Tree 0.8922238 0.9379919 0
## Random Forest 0.8799720 0.9768578 0
##
## RMSE
## Min. 1st Qu. Median Mean
## Simple Linear Regression 0.7102955 1.0003512 1.077962 1.038802
## Linear Model with Interaction Terms 0.8074164 0.8984089 1.075218 1.036226
## Decision Tree 0.8733669 1.0802861 1.120179 1.109342
## Random Forest 0.6934338 0.9653139 1.094258 1.035677
## 3rd Qu. Max. NA's
## Simple Linear Regression 1.142252 1.233396 0
## Linear Model with Interaction Terms 1.141242 1.261835 0
## Decision Tree 1.171708 1.290895 0
## Random Forest 1.112067 1.275043 0
##
## Rsquared
## Min. 1st Qu. Median
## Simple Linear Regression 3.176991e-02 0.054239728 0.099058910
## Linear Model with Interaction Terms 1.047807e-02 0.101023032 0.259474551
## Decision Tree 4.775521e-07 0.001578222 0.006112182
## Random Forest 3.718654e-02 0.051274039 0.087046384
## Mean 3rd Qu. Max. NA's
## Simple Linear Regression 0.14500672 0.20531741 0.4057080 0
## Linear Model with Interaction Terms 0.22839171 0.31626675 0.4823153 0
## Decision Tree 0.01192204 0.01974722 0.0346907 3
## Random Forest 0.16565037 0.26892284 0.4351761 0
# Generate predictions
pred_lm_simple <- predict(model_simple_linear_cv, newdata = test_set)
pred_lm_interact <- predict(model_gics_interaction_cv, newdata = test_set)
pred_tree <- predict(model_tree, newdata = test_set)
pred_rf <- predict(model_rf, newdata = test_set)
eval_df <- data.frame(
actual = test_set$log_total_return,
pred_lm_simple = pred_lm_simple,
pred_lm_interact = pred_lm_interact,
pred_tree = pred_tree,
pred_rf = pred_rf
)
# Remove rows with any NA
eval_df <- na.omit(eval_df)
# View results
eval_results <- rbind(
"Simple Linear Regression" = postResample(pred = eval_df$pred_lm_simple, obs = eval_df$actual),
"Linear Model with Interaction Terms" = postResample(pred = eval_df$pred_lm_interact, obs = eval_df$actual),
"Decision Tree" = postResample(pred = eval_df$pred_tree, obs = eval_df$actual),
"Random Forest" = postResample(pred = eval_df$pred_rf, obs = eval_df$actual)
)
print(eval_results)
## RMSE Rsquared MAE
## Simple Linear Regression 0.8774661 0.2173015 0.6842831
## Linear Model with Interaction Terms 1.0807527 0.1144627 0.7810884
## Decision Tree 0.9849939 NA 0.8048484
## Random Forest 0.9508921 0.1172154 0.7739230
ggplot(data = eval_df, aes(x = actual, y = pred_lm_simple)) +
geom_point(alpha = 0.5, color = "#021E36") +
geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") +
labs(
title = "Predicted vs Actual Log Total Return",
x = "Actual Log Total Return",
y = "Predicted Log Total Return"
)
eval_df$residuals <- eval_df$actual - eval_df$pred_lm_simple
ggplot(eval_df, aes(x = pred_lm_simple, y = residuals)) +
geom_point(alpha = 0.5) +
geom_hline(yintercept = 0, color = "red", linetype = "dashed") +
labs(
title = "Residuals vs Predicted Values",
x = "Predicted Log Total Return",
y = "Residuals"
)
The goal of this research project was to assess whether ESG scores can meaningfully explain or predict financial performance, particularly total stock return, across companies listed in the S&P 500 index. Motivated by growing debates on the financial materiality of ESG metrics and the emergence of stricter reporting frameworks like the EU’s Corporate Sustainability Reporting Directive (CSRD), this project sought to determine whether ESG scores still offer useful signals to investors, or whether more detailed, standardized disclosures are necessary.
Following exploratory analysis and several regression models, the simple linear regression model emerged as the most robust and generalizable. When evaluated on the test set, it yielded the lowest Root Mean Squared Error (RMSE = 0.878), lowest Mean Absolute Error (MAE = 0.683), and highest R² (0.216), outperforming more complex models such as those with ESG-sector interaction terms, decision trees, and random forests. This result is particularly notable because it suggests that simpler, more interpretable models may offer better generalization to unseen data than more flexible but complex alternatives.
However, while the linear model performed best among those tested, its explanatory power remains modest—explaining only 21.6% of the variance in log-transformed total return. This supports the conclusion drawn from earlier correlation and residual diagnostics: ESG scores, at least in their current aggregated form, may not serve as strong standalone predictors of financial performance across all sectors. The weak and varying relationships observed across sectors indicate that ESG relevance is likely contingent on industry-specific factors and possibly firm-level heterogeneity.
In deployment, the model can serve as a lightweight diagnostic or benchmarking tool for analysts assessing the financial relevance of ESG metrics. Nonetheless, the limited predictive strength also underscores the need for more granular ESG data and possibly more context-sensitive modeling approaches in future research. Overall, the findings offer cautious support for ESG-based evaluation, while pointing toward the limitations of treating ESG scores as universally predictive.
Ademi, B., & Klungseth, N. J. (2022). Does it pay to deliver superior ESG performance? Evidence from US S&P 500 companies. Journal of Global Responsibility, 13(4), 421–449. https://doi.org/10.1108/jgr-01-2022-0006 (https://doi.org/10.1108/jgr-01-2022-0006) Corporate sustainability reporting. (n.d.). Finance. https://finance.ec.europa.eu/capital-markets-union-and-financial-markets/company-reporting-and-auditing/company-reporting/corporate-sustainability-reporting_en (https://finance.ec.europa.eu/capital-markets-union-and-financial-markets/company-reporting-and-auditing/company-reporting/corporate-sustainability-reporting_en) Darolles, S., He, Y., & Fol, G. L. (2023). Understanding the effect of ESG scores on stock returns using mediation theory. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4634699 (https://doi.org/10.2139/ssrn.4634699)